Find every unique word

Paste any text and get a list of every unique word, alphabetised. Toggle Match Case to keep Word separate from word, switch Separator between newline, space, or comma, and set Mode to Exactly Once to keep only hapax legomena (words that appear just once). The transform runs in your browser; nothing is uploaded. For counts and ranks, see word frequency.


Distinct word lists, sorted alphabetically

The tokenizer is \b[\w']+\b: any run of word characters with optional internal apostrophes. Hyphens are word boundaries here, so well-known contributes well and known as separate entries. The list is deduplicated and then sorted alphabetically before being joined with the chosen separator.
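The tokenize, deduplicate, and sort steps can be sketched in a few lines of JavaScript. This is a minimal illustration, assuming the regex quoted above is applied with the global flag and that Match Case is off; the names WORD_RE, tokenize, and distinctSorted are hypothetical and are not taken from the tool's source.

```javascript
// Assumed tokenizer: the regex from the description, with the global flag.
const WORD_RE = /\b[\w']+\b/g;

// Return every token in order of appearance (empty array when none match).
function tokenize(text) {
  return text.match(WORD_RE) || [];
}

// Lowercase, tokenize, deduplicate with a Set, then apply the default sort.
function distinctSorted(text) {
  const words = tokenize(text.toLowerCase());
  return [...new Set(words)].sort();
}

console.log(distinctSorted("A well-known fact: don't repeat a fact."));
// → [ 'a', "don't", 'fact', 'known', 'repeat', 'well' ]
```

Note how well-known yields two entries while don't stays whole, matching the hyphen and apostrophe behaviour described above.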

The output starts with a one-line header: Unique words: N, where N is the count of distinct entries after the case and mode filters. A blank line follows, then the joined list. With Separator = Newline (default) you get one word per line, ready to paste into a spreadsheet column. With Comma you get a comma-space separated string suitable for use as a tag list.
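The output layout described above (header, blank line, joined list) could be assembled roughly as follows. This is a sketch under assumptions: the SEPARATORS map and the formatOutput function are hypothetical names, and the option keys are illustrative rather than the tool's actual identifiers.

```javascript
// Hypothetical mapping from the Separator setting to the joiner string.
const SEPARATORS = { newline: "\n", space: " ", comma: ", " };

// Build "Unique words: N", a blank line, then the joined word list.
function formatOutput(words, separator = "newline") {
  const header = `Unique words: ${words.length}`;
  return `${header}\n\n${words.join(SEPARATORS[separator])}`;
}

console.log(formatOutput(["brown", "dog", "fox"], "comma"));
// Unique words: 3
//
// brown, dog, fox
```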

Mode = Distinct (default) lists every unique word once. Exactly Once lists only hapax legomena: words that appear in the source exactly one time. The latter is useful when proofreading for typos (a one-off misspelling appears only once and so stays in the list, while the correctly spelled form used repeatedly is excluded) or surfacing rare vocabulary.
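One way to implement an Exactly Once filter is to count occurrences and keep the words whose count is 1. The sketch below assumes the input is an already tokenized, already lowercased array; exactlyOnce is a hypothetical name, not the tool's own function.

```javascript
// Count each token, then keep only those seen exactly one time.
function exactlyOnce(words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n === 1)
    .map(([word]) => word)
    .sort();
}

// Tokens for "Anderson met Andersen. Anderson left." after lowercasing.
console.log(exactlyOnce(["anderson", "met", "andersen", "anderson", "left"]));
// → [ 'andersen', 'left', 'met' ]
```

In this sample the one-off spelling andersen survives the filter, while anderson, which occurs twice, is dropped.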

How to use find every unique word

  1. Paste or type your text into the input panel on the left.
  2. The unique-word list appears in the output panel as you type.
  3. Switch Separator to Comma for a tag-style list, or Space for a single line.
  4. Turn on Match Case if Word and word should be different entries.
  5. Switch Mode to Exactly Once to filter to words that appear just once.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut            Action
Ctrl+F              Open the find & replace panel inside the input (Plus)
Ctrl+Z              Undo the last input change
Ctrl+Shift+Z        Redo
Ctrl+Shift+Enter    Toggle fullscreen focus on the editor (Plus)
Esc                 Close find & replace, or exit fullscreen
Ctrl+K              Open the command palette to jump to any tool (Plus)
Ctrl+S              Save current workflow draft (Plus)
Ctrl+P              Run a saved workflow (Plus)

What this tool actually does

Tokenizes on \b[\w']+\b

Words are runs of \w with optional internal apostrophes. Hyphens are word boundaries here (unlike in the word counter). Punctuation acts as a boundary and is never emitted; numbers count as words.

Match Case toggle

Default off: the input is lowercased before tokenizing, so The and the collapse into a single entry. On: case is preserved and capitalised variants stay separate. Useful when proper nouns matter.

Mode = Distinct or Exactly Once

Default Distinct emits every unique word once. Exactly Once filters to words that appear in the source exactly one time (hapax legomena). The header line keeps the Unique words: N label, but N is then the filtered count.

Separator picks the joiner

Newline (default) produces one word per line. Comma produces word, word, word. Space produces a single space-joined line. The header (Unique words: N + blank line) always uses newlines regardless.

Output is alphabetically sorted

After deduplication and mode-filtering, the list is sorted via Array.prototype.sort(), whose default comparison is by UTF-16 code unit. With Match Case off, every word is already lowercase, so the result reads alphabetically; with it on, ASCII uppercase letters sort before lowercase ones, so Zebra comes before apple.
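The effect of the default sort on mixed-case input is easy to check directly. The first call below uses the default Array.prototype.sort() comparison (UTF-16 code units). The second uses a hypothetical comparator, caseInsensitiveCompare, which is not something the tool is described as doing; it is shown only as an option for post-processing the output if dictionary-style order is preferred.

```javascript
const words = ["apple", "Zebra", "mango", "Apple"];

// Default sort: uppercase ASCII (65-90) precedes lowercase ASCII (97-122).
console.log([...words].sort());
// → [ 'Apple', 'Zebra', 'apple', 'mango' ]

// Hypothetical post-processing: compare case-folded forms first,
// then fall back to the raw strings so ties resolve deterministically.
function caseInsensitiveCompare(a, b) {
  const la = a.toLowerCase();
  const lb = b.toLowerCase();
  if (la < lb) return -1;
  if (la > lb) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

console.log([...words].sort(caseInsensitiveCompare));
// → [ 'Apple', 'apple', 'mango', 'Zebra' ]
```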

Worked example

The input below contains nine distinct words after lowercasing. Switch Mode to Exactly Once and the, fox, and quick drop out because each appears more than once, leaving 6 entries.

Input
the quick brown fox jumps over the lazy dog. the fox is quick.
Output
Unique words: 9

brown
dog
fox
is
jumps
lazy
over
quick
the
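The worked example can be reproduced end to end with a short script. This is a sketch that follows the behaviour described on this page, not the tool's actual source: the uniqueWords function, its option names, and the "exactlyOnce" mode string are all assumptions made for illustration.

```javascript
// Hypothetical end-to-end pipeline: case handling, tokenizing, counting,
// mode filtering, sorting, and header + joined output.
function uniqueWords(text, { matchCase = false, mode = "distinct", separator = "\n" } = {}) {
  const source = matchCase ? text : text.toLowerCase();
  const counts = new Map();
  for (const word of source.match(/\b[\w']+\b/g) || []) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  const words = [...counts.entries()]
    .filter(([, n]) => mode === "distinct" || n === 1)
    .map(([word]) => word)
    .sort();
  return `Unique words: ${words.length}\n\n${words.join(separator)}`;
}

const sample = "the quick brown fox jumps over the lazy dog. the fox is quick.";

console.log(uniqueWords(sample));
// Unique words: 9, followed by a blank line and the nine words above

console.log(uniqueWords(sample, { mode: "exactlyOnce", separator: ", " }));
// Unique words: 6
//
// brown, dog, is, jumps, lazy, over
```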

Settings reference

Option                           Effect on output
Match Case (default off)         On: Word and word are separate. Off: collapsed.
Separator = Newline (default)    One word per line.
Separator = Space                Single space-joined line.
Separator = Comma                word, word, word format.
Mode = Distinct (default)        Every unique word once.
Mode = Exactly Once              Only words appearing exactly one time.
Header                           Always Unique words: N + blank line + the joined list.

FAQ

How is a word defined?
A run of \w with optional internal apostrophes. don't is one word. well-known splits into two words because the hyphen is a boundary here.
Are matches case-sensitive?
Not by default. With Match Case off, the input is lowercased before tokenizing, so The and the are the same entry. Turn it on to keep them separate.
What does Exactly Once mode do?
It filters the unique list down to words that appear in the source exactly one time (hapax legomena). Words that appear two or more times are dropped from the output.
How are the words ordered?
Alphabetically via Array.prototype.sort(). With Match Case off the sort is on lowercased forms; with it on, uppercase ASCII letters sort before lowercase ones.
Where do I get counts instead of just the list?
Use word frequency: same tokenizer, but each word is paired with its count and (optionally) its percentage of the total.