Longest common substring

Longest common substring (LCS) finds the largest unbroken stretch of characters shared by two texts. Paste both into the input pane separated by a line of ---; the tool runs a dynamic-programming pass over the two strings and returns the matching span plus its length. The compute runs in your browser; nothing uploads. For an edit count instead see Levenshtein distance.

Input
Line 1:1 LF cloud_done Saved locally
Result Longest Common Substring
0 lines 0 chars

Dynamic-programming substring search

The longest common substring is the longest sequence of characters that appears contiguously in both inputs. Unlike longest common subsequence, the matching characters must be adjacent on both sides. The algorithm fills a 2D dynamic-programming table where dp[i][j] is the length of the match ending at A[i-1] and B[j-1]; the maximum cell value is the answer, and its position locates the substring.

The split between text A and text B is a line containing exactly ---. Output is two lines: Longest common substring (N chars): followed by the matched text. N is the length in code units. If both inputs have characters but share none in common, N is 0 and the second line is empty.

Comparison is exact: case sensitive, whitespace included. Spaces, line breaks, and punctuation can all be part of the matched span. Memory is linear in the shorter side; the implementation uses two rolling arrays rather than a full n × m matrix, so very long inputs still complete in well under a second.

How to use longest common substring

  1. 1Paste text A into the input panel, then a line with ---, then text B.
  2. 2The longest matching span and its length appear in the output panel as you type.
  3. 3Click Copy to copy the matched substring with its header line.
  4. 4For overall similarity rather than a single span, see text similarity.
  5. 5For minimum edits between the two strings switch to Levenshtein distance.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut Action
Ctrl FOpen the find & replace panel inside the input Plus
Ctrl ZUndo the last input change
Ctrl Shift ZRedo
Ctrl Shift EnterToggle fullscreen focus on the editor Plus
EscClose find & replace, or exit fullscreen
Ctrl KOpen the command palette to jump to any tool Plus
Ctrl SSave current workflow draft Plus
Ctrl PRun a saved workflow Plus

How the search works

Dynamic-programming length table

The algorithm fills a table where dp[i][j] equals dp[i-1][j-1] + 1 when A[i-1] === B[j-1] and 0 otherwise. The maximum value across the table is the length of the longest common substring; its position locates the start.

Linear memory rolling arrays

Two arrays of length m + 1 are used as a rolling window over the table rather than a full matrix. Memory stays linear in the shorter side, so multi-kilobyte inputs still complete in well under a second.

Contiguous, case sensitive

Matched characters must be adjacent on both sides; gaps disqualify a span. Comparison is exact, so fox and Fox are different. Lowercase first with lowercase if you want case folding.

Whitespace and punctuation are content

Spaces, line breaks, and punctuation can be part of the matched span. brown fox jumps is a perfectly valid result, including the spaces.

Three-hyphen separator

The split between A and B is a line containing exactly ---. The tool requires this marker to know where A ends and B begins.

Worked example

The shared span brown fox jumps includes the spaces and is 16 code units long. For all common words instead of one contiguous span, see word set intersection.

Input
the quick brown fox jumps
---
a fast brown fox jumps over
Output
Longest common substring (16 chars):
brown fox jumps

Settings reference

Behaviour Effect on output
Separator A line containing exactly --- splits text A from text B.
Algorithm Dynamic programming with two rolling arrays.
Match rule Contiguous, exact, case sensitive.
Length unit UTF-16 code units (JavaScript .length).
Whitespace Treated as content; spaces can be part of the matched span.
No common chars Length 0 with an empty matched-span line.
Missing separator Output prompts for two halves split by ---.

FAQ

Is this the same as longest common subsequence?
No. Subsequence allows gaps; substring requires the matched characters to be contiguous on both sides. longest-common-substring is the contiguous version.
Is the search case sensitive?
Yes. Fox and fox do not match. Lowercase both halves first with lowercase if you want case-folded behaviour.
What is reported when there is more than one span of equal length?
The first one found by the dynamic-programming sweep is returned. The sweep walks A in outer order and B in inner order, so the first match by index in A wins.
How fast is it on long inputs?
Time is O(n × m), memory is O(min(n, m)) due to the rolling-array trick. Multi-kilobyte inputs complete well under a second on modern browsers.
How does this differ from Levenshtein distance?
Levenshtein counts the minimum edits to turn A into B. Longest common substring measures the largest unchanged span. Both use dynamic programming, but answer different questions.