Longest Common Substring - Find Shared Span

Dynamic-programming substring search

The longest common substring is the longest sequence of characters that appears contiguously in both inputs. Unlike longest common subsequence, the matching characters must be adjacent on both sides. The algorithm fills a 2D dynamic-programming table where dp[i][j] is the length of the match ending at A[i-1] and B[j-1]; the maximum cell value is the answer, and its position locates the substring.

The split between text A and text B is a line containing exactly ---. Output is two lines: Longest common substring (N chars): followed by the matched text. N is the length in code units. If both inputs have characters but share none in common, N is 0 and the second line is empty.

Comparison is exact: case sensitive, whitespace included. Spaces, line breaks, and punctuation can all be part of the matched span. Memory is linear in the shorter side; the implementation uses two rolling arrays rather than a full n × m matrix, so very long inputs still complete in well under a second.

How to use longest common substring

1Paste text A into the input panel, then a line with ---, then text B.
2The longest matching span and its length appear in the output panel as you type.
3Click Copy to copy the matched substring with its header line.
4For overall similarity rather than a single span, see text similarity.
5For minimum edits between the two strings switch to Levenshtein distance.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut	Action
`Ctrl` `F`	Open the find & replace panel inside the input Plus
`Ctrl` `Z`	Undo the last input change
`Ctrl` `Shift` `Z`	Redo
`Ctrl` `Shift` `Enter`	Toggle fullscreen focus on the editor Plus
`Esc`	Close find & replace, or exit fullscreen
`Ctrl` `K`	Open the command palette to jump to any tool Plus
`Ctrl` `S`	Save current workflow draft Plus
`Ctrl` `P`	Run a saved workflow Plus

How the search works

Dynamic-programming length table

The algorithm fills a table where dp[i][j] equals dp[i-1][j-1] + 1 when A[i-1] === B[j-1] and 0 otherwise. The maximum value across the table is the length of the longest common substring; its position locates the start.

Linear memory rolling arrays

Two arrays of length m + 1 are used as a rolling window over the table rather than a full matrix. Memory stays linear in the shorter side, so multi-kilobyte inputs still complete in well under a second.

Contiguous, case sensitive

Matched characters must be adjacent on both sides; gaps disqualify a span. Comparison is exact, so fox and Fox are different. Lowercase first with lowercase if you want case folding.

Whitespace and punctuation are content

Spaces, line breaks, and punctuation can be part of the matched span. brown fox jumps is a perfectly valid result, including the spaces.

Three-hyphen separator

The split between A and B is a line containing exactly ---. The tool requires this marker to know where A ends and B begins.

Worked example

The shared span brown fox jumps includes the spaces and is 16 code units long. For all common words instead of one contiguous span, see word set intersection.

Input

the quick brown fox jumps
---
a fast brown fox jumps over

Output

Longest common substring (16 chars):
brown fox jumps

Settings reference

Behaviour	Effect on output
Separator	A line containing exactly `---` splits text A from text B.
Algorithm	Dynamic programming with two rolling arrays.
Match rule	Contiguous, exact, case sensitive.
Length unit	UTF-16 code units (JavaScript `.length`).
Whitespace	Treated as content; spaces can be part of the matched span.
No common chars	Length 0 with an empty matched-span line.
Missing separator	Output prompts for two halves split by `---`.

FAQ

Is this the same as longest common subsequence?

No. Subsequence allows gaps; substring requires the matched characters to be contiguous on both sides. longest-common-substring is the contiguous version.

Is the search case sensitive?

Yes. Fox and fox do not match. Lowercase both halves first with lowercase if you want case-folded behaviour.

What is reported when there is more than one span of equal length?

The first one found by the dynamic-programming sweep is returned. The sweep walks A in outer order and B in inner order, so the first match by index in A wins.

How fast is it on long inputs?

Time is O(n × m), memory is O(min(n, m)) due to the rolling-array trick. Multi-kilobyte inputs complete well under a second on modern browsers.

How does this differ from Levenshtein distance?

Levenshtein counts the minimum edits to turn A into B. Longest common substring measures the largest unchanged span. Both use dynamic programming, but answer different questions.

Also known as

longest common substring find shared text span lcs substring finder common contiguous text matching block finder shared substring tool longest match between strings string overlap finder