Find Unique Words - List Distinct Words

Distinct word lists, sorted alphabetically

The tokenizer is \b[\w']+\b: any run of word characters with optional internal apostrophes. Hyphens are word boundaries here, so well-known contributes well and known as separate entries. The list is deduplicated and then sorted alphabetically before being joined with the chosen separator.

The output starts with a one-line header: Unique words: N, where N is the count of distinct entries after the case and mode filters. A blank line follows, then the joined list. With Separator = Newline (default) you get one word per line, ready to paste into a spreadsheet column. With Comma you get a comma-space separated string suitable for use as a tag list.

Mode = Distinct (default) lists every unique word once. Exactly Once lists only hapax legomena: words that appear in the source exactly one time. The latter is useful when proofreading for typos (a one-off misspelling shows up in the list, while any word used two or more times is excluded) or surfacing rare vocabulary.

How to use find every unique word

1. Paste or type your text into the input panel on the left.
2. The unique-word list appears in the output panel as you type.
3. Switch Separator to Comma for a tag-style list, or Space for a single line.
4. Turn on Match Case if Word and word should be different entries.
5. Switch Mode to Exactly Once to filter to words that appear just once.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut	Action
`Ctrl` `F`	Open the find & replace panel inside the input (Plus)
`Ctrl` `Z`	Undo the last input change
`Ctrl` `Shift` `Z`	Redo
`Ctrl` `Shift` `Enter`	Toggle fullscreen focus on the editor (Plus)
`Esc`	Close find & replace, or exit fullscreen
`Ctrl` `K`	Open the command palette to jump to any tool (Plus)
`Ctrl` `S`	Save current workflow draft (Plus)
`Ctrl` `P`	Run a saved workflow (Plus)

What this tool actually does

Tokenizes on `\b[\w']+\b`

Words are runs of \w with optional internal apostrophes. Hyphens are word boundaries here, unlike in the word counter tool. Punctuation acts as a boundary and is never part of a word; numbers count as words.
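The tokenizing step can be sketched in a few lines of JavaScript. This is only an illustration: it assumes the pattern is applied globally with String.prototype.match, and the function name tokenize is hypothetical rather than taken from the tool's source.

```javascript
// Hypothetical helper: split text into words using the pattern quoted above.
function tokenize(text) {
  // The g flag returns every match; match() returns null when nothing matches.
  return text.match(/\b[\w']+\b/g) || [];
}

console.log(tokenize("A well-known rule: don't panic, 42 times."));
// → ["A", "well", "known", "rule", "don't", "panic", "42", "times"]
```

The hyphen and the punctuation fall outside the character class, so they end a match, while the apostrophe inside don't is kept because it sits between word characters.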

Match Case toggle

Default off: the input is lowercased before tokenizing, so The and the collapse into a single entry. On: case is preserved and capitalised variants stay separate. Useful when proper nouns matter.
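A rough sketch of how the toggle could interact with deduplication, assuming the input is lowercased with toLowerCase() and repeats are dropped with a Set. The name distinctWords and its matchCase parameter are hypothetical.

```javascript
// Hypothetical helper: list distinct words, optionally folding case first.
function distinctWords(text, matchCase = false) {
  // With Match Case off, lowercase the whole input before tokenizing.
  const source = matchCase ? text : text.toLowerCase();
  const tokens = source.match(/\b[\w']+\b/g) || [];
  // A Set keeps the first occurrence of each word and drops repeats.
  return [...new Set(tokens)];
}

console.log(distinctWords("The cat saw the Cat"));       // → ["the", "cat", "saw"]
console.log(distinctWords("The cat saw the Cat", true)); // → ["The", "cat", "saw", "the", "Cat"]
```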

Mode = Distinct or Exactly Once

Default Distinct emits every unique word once. Exactly Once filters to words that appear in the source exactly one time (hapax legomena). The header line still shows Unique words: N for the filtered count.
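One way the Exactly Once filter could work is to count every token and keep only those with a count of one. The sketch below assumes a Map-based counter; exactlyOnce is a hypothetical name, not the tool's actual function.

```javascript
// Hypothetical helper: keep only words whose total count is exactly one.
function exactlyOnce(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  // Map preserves insertion order, so survivors stay in first-seen order.
  return [...counts.keys()].filter((word) => counts.get(word) === 1);
}

const tokens = ["the", "fox", "saw", "the", "dog"];
console.log(exactlyOnce(tokens)); // → ["fox", "saw", "dog"]
```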

Separator picks the joiner

Newline (default) produces one word per line. Comma produces word, word, word. Space produces a single space-joined line. The header (Unique words: N + blank line) always uses newlines regardless.
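The output layout described here can be sketched as a header, a blank line, and the joined list. The JOINERS table and the formatOutput name are assumptions made for illustration; only the three separator behaviours come from the description above.

```javascript
// Assumed mapping from the Separator setting to the joiner string.
const JOINERS = { newline: "\n", comma: ", ", space: " " };

// Hypothetical helper: build the header, a blank line, then the joined list.
function formatOutput(words, separator = "newline") {
  const header = `Unique words: ${words.length}`;
  // The header block always uses newlines, whatever joiner the list uses.
  return `${header}\n\n${words.join(JOINERS[separator])}`;
}

console.log(formatOutput(["brown", "dog", "fox"], "comma"));
// Unique words: 3
//
// brown, dog, fox
```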

Output is alphabetically sorted

After deduplication and mode-filtering, the list is sorted alphabetically via Array.prototype.sort(). With Match Case off, the sort is on lowercased forms. With it on, ASCII uppercase letters sort before lowercase, because the default comparison orders strings by UTF-16 code unit (for ASCII, the same as Unicode code-point order).
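The ordering difference is easy to see with a small example. This sketch assumes sort() is called without a comparator, which is what the uppercase-first behaviour described above suggests; the sample words are made up.

```javascript
const words = ["the", "Zebra", "apple", "The"];

// Default sort() compares strings by UTF-16 code unit, so "T" and "Z" precede any lowercase letter.
const caseSensitive = [...words].sort();
console.log(caseSensitive); // → ["The", "Zebra", "apple", "the"]

// With case folded first, "The" and "the" collapse and the order is plain a-z.
const folded = [...new Set(words.map((w) => w.toLowerCase()))].sort();
console.log(folded); // → ["apple", "the", "zebra"]
```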

Worked example

The input below yields nine distinct words after lowercasing. Switch Mode to Exactly Once and `the`, `fox`, and `quick` drop out (each appears more than once), leaving 6 entries.

Input

the quick brown fox jumps over the lazy dog. the fox is quick.

Output

Unique words: 9

brown
dog
fox
is
jumps
lazy
over
quick
the
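The worked example can be reproduced with a short end-to-end sketch that chains the steps described on this page: lowercase, tokenize, count, filter, sort, join. The function findUniqueWords and its option names are hypothetical and only approximate what the tool does internally.

```javascript
// Hypothetical end-to-end helper; option names are illustrative, not the tool's API.
function findUniqueWords(text, { matchCase = false, mode = "distinct", joiner = "\n" } = {}) {
  const source = matchCase ? text : text.toLowerCase();
  const tokens = source.match(/\b[\w']+\b/g) || [];

  // Count occurrences; the Map keys double as the deduplicated word list.
  const counts = new Map();
  for (const token of tokens) counts.set(token, (counts.get(token) || 0) + 1);

  let words = [...counts.keys()];
  if (mode === "exactlyOnce") words = words.filter((w) => counts.get(w) === 1);
  words.sort();

  return `Unique words: ${words.length}\n\n${words.join(joiner)}`;
}

const input = "the quick brown fox jumps over the lazy dog. the fox is quick.";
console.log(findUniqueWords(input));
console.log(findUniqueWords(input, { mode: "exactlyOnce", joiner: ", " }));
```

The first call prints the nine-word list shown above; the second prints a header of Unique words: 6 followed by brown, dog, is, jumps, lazy, over.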

Settings reference

Option	Effect on output
Match Case (default off)	On: `Word` and `word` are separate. Off: collapsed.
Separator = Newline (default)	One word per line.
Separator = Space	Single space-joined line.
Separator = Comma	`word, word, word` format.
Mode = Distinct (default)	Every unique word once.
Mode = Exactly Once	Only words appearing exactly one time.
Header	Always `Unique words: N` + blank line + the joined list.

FAQ

How is a word defined?

A run of \w with optional internal apostrophes. don't is one word. well-known splits into two words because the hyphen is a boundary here.

Are matches case-sensitive?

No by default. Match Case off lowercases the input before tokenizing, so The and the are the same entry. Turn it on to keep them separate.

What does Exactly Once mode do?

It filters the unique list down to words that appear in the source exactly one time (hapax legomena). Words that appear two or more times are dropped from the output.

How are the words ordered?

Alphabetically via Array.prototype.sort(). With Match Case off the sort is on lowercased forms; with it on, uppercase letters sort before lowercase.

Where do I get counts instead of just the list?

Use word frequency: same tokenizer, but each word is paired with its count and (optionally) its percentage of the total.

Also known as

find unique words, distinct words list, unique vocabulary extractor, hapax legomena finder, word deduplication, list distinct words, extract unique terms, unique tokens in text
