Rank words by frequency

Paste any text and get a frequency table: each unique word with how many times it appears. Switch Group Size to 2 for bigrams, 3 for trigrams, up to 6. Toggle the count column, percentages, a total footer, alphabetical sort, and case folding. The transform runs in your browser; nothing uploads. For a single number instead of a table, see word counter.

Frequency tables, configurable to the column

The tool tokenizes on \b[\w']+\b (any run of word characters, optionally with internal apostrophes) and counts occurrences. With Group Size = 1 you get a word-frequency table. With 2, the count is over consecutive word pairs (bigrams), so "the fox" and "fox is" are separate keys. The maximum group size is 6.
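The counting step described above can be sketched in Python. This is an illustration under stated assumptions, not the tool's source: the helper name `count_ngrams` is hypothetical, and Python's `re` module treats `\w` as Unicode-aware, which may differ from the regex engine the browser uses.

```python
import re
from collections import Counter

# Pattern quoted in the text above.
TOKEN_RE = re.compile(r"\b[\w']+\b")

def count_ngrams(text: str, group_size: int = 1) -> Counter:
    """Count consecutive runs of `group_size` tokens (hypothetical helper)."""
    if not 1 <= group_size <= 6:
        raise ValueError("group_size must be between 1 and 6")
    tokens = TOKEN_RE.findall(text)
    # Slide a window of `group_size` tokens and join each window with one space.
    grams = (
        " ".join(tokens[i:i + group_size])
        for i in range(len(tokens) - group_size + 1)
    )
    return Counter(grams)

bigrams = count_ngrams("the fox is the fox", group_size=2)
print(bigrams["the fox"], bigrams["fox is"])  # 2 1
```

A text shorter than the group size yields no keys at all, because the sliding window never fits.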

Pre-processing is governed by three toggles. Strip Punct (on by default) replaces .,;:!?'"()[]{}*@#$%^&+=`~/\|<>_- with spaces before tokenizing, so "fox." and "fox" collapse into one key. Ignore Case (on by default) lowercases the input first, so The and the are the same key. Stop at . (off by default) keeps each group inside a single sentence: bigrams do not span ., !, or ?.
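A minimal Python sketch of the two text-normalizing toggles, assuming they behave exactly as described above. The function name `preprocess` and its parameter names are hypothetical; the punctuation set is copied from the list in the text.

```python
# Punctuation set as listed above; each character is replaced by a space.
PUNCT = ".,;:!?'\"()[]{}*@#$%^&+=`~/\\|<>_-"
TO_SPACE = str.maketrans(PUNCT, " " * len(PUNCT))

def preprocess(text: str, strip_punct: bool = True, ignore_case: bool = True) -> str:
    """Apply Ignore Case and Strip Punct before tokenizing (hypothetical helper)."""
    if ignore_case:
        text = text.lower()
    if strip_punct:
        text = text.translate(TO_SPACE)
    return text

print(preprocess("The fox. the Fox!").split())  # ['the', 'fox', 'the', 'fox']
```

Because the apostrophe is in the stripped set, this sketch turns don't into don t when Strip Punct is on; whether the tool itself behaves that way is an inference from the listed character set.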

Output formatting is configurable too: Show Count prepends the count column, Show % prepends a percent-of-total column, Show Total appends a Total: N footer, and Sort picks By Count (default), Alphabetical, or Insertion Order. Output is capped at 200 entries to keep the panel readable. For just the unique words without counts, see find unique words.

How to use rank words by frequency

  1. Paste or type your text into the input panel on the left.
  2. The frequency table appears in the output panel, sorted by count.
  3. Switch Group Size to 2 for bigrams, 3 for trigrams.
  4. Toggle Show % or Show Total in the action bar to add columns.
  5. Switch Sort to Alphabetical or Insertion Order if count-sorting is not what you want.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut: Action
Ctrl+F: Open the find & replace panel inside the input (Plus)
Ctrl+Z: Undo the last input change
Ctrl+Shift+Z: Redo
Ctrl+Shift+Enter: Toggle fullscreen focus on the editor (Plus)
Esc: Close find & replace, or exit fullscreen
Ctrl+K: Open the command palette to jump to any tool (Plus)
Ctrl+S: Save current workflow draft (Plus)
Ctrl+P: Run a saved workflow (Plus)

What this tool actually does

Tokenizes on \b[\w']+\b

Words are runs of \w with optional internal apostrophes. Hyphenated terms split on the hyphen here (different from word counter). Punctuation is stripped first when Strip Punct is on, which is the default.
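To see how the pattern treats hyphens and apostrophes, here is a short Python check. It applies the regex directly, which corresponds to Strip Punct being off; Python's `re` is used as a stand-in for the browser's engine, so edge cases may differ.

```python
import re

TOKEN_RE = re.compile(r"\b[\w']+\b")

# The hyphen is not in the character class, so it ends a token.
print(TOKEN_RE.findall("a well-known fox"))  # ['a', 'well', 'known', 'fox']

# An internal apostrophe stays; leading and trailing quotes are dropped
# because \b requires a word character at each end of the match.
print(TOKEN_RE.findall("don't 'quote'"))     # ["don't", 'quote']
```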

Group Size 1 to 6 for n-grams

Default 1 = unigrams (word frequencies). 2 = bigrams (consecutive pairs). 3 = trigrams. Up to 6. Higher group sizes produce many more keys; the output cap of 200 entries still applies.

Stop at .: respect sentence boundaries

When on, the input is split on ., !, ? before n-gram extraction. A bigram will not include the last word of one sentence and the first word of the next. Off (default), n-grams span freely across sentences.
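The effect of this option can be sketched in Python as follows. The function name `count_ngrams_by_sentence` and the flag `stop_at_period` are hypothetical, and splitting on every ., !, or ? is an assumption taken from the description above; a split this simple would also break on decimals and abbreviations.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"\b[\w']+\b")

def count_ngrams_by_sentence(text, group_size=2, stop_at_period=True):
    """Count n-grams, optionally keeping each one inside a single sentence."""
    # Split on ., !, ? when the option is on; otherwise use one segment.
    segments = re.split(r"[.!?]", text) if stop_at_period else [text]
    counts = Counter()
    for segment in segments:
        tokens = TOKEN_RE.findall(segment)
        for i in range(len(tokens) - group_size + 1):
            counts[" ".join(tokens[i:i + group_size])] += 1
    return counts

text = "the lazy dog. the fox is quick."
print("dog the" in count_ngrams_by_sentence(text, stop_at_period=False))  # True
print("dog the" in count_ngrams_by_sentence(text, stop_at_period=True))   # False
```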

Strip Punct and Ignore Case

Strip Punct (default on) replaces a wide set of ASCII punctuation with spaces before tokenizing, so fox. and fox collapse into the same key. Ignore Case (default on) lowercases the input first, so The and the collapse. Turn either off to keep the original variants distinct.

Output columns and sort

Show Count (default on) writes a count column. Show % (default off) adds a percent-of-total column. Show Total (default off) adds a final Total: N line. Sort defaults to By Count; switch to Alphabetical or Insertion Order to change the row order. The output is tab-separated and capped at 200 rows.
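The column and sort options can be sketched in Python like this. The function `format_table` and its parameters are hypothetical. Several details are assumptions rather than documented behavior: the percent column follows the count column, percentages use one decimal place, Total counts every occurrence rather than only the rows shown, and ties under By Count keep insertion order.

```python
from collections import Counter

def format_table(counts, show_count=True, show_percent=False,
                 show_total=False, sort="count", cap=200):
    """Render a Counter as tab-separated rows (hypothetical helper)."""
    total = sum(counts.values())
    items = list(counts.items())  # Counter preserves insertion order
    if sort == "count":
        items.sort(key=lambda kv: -kv[1])  # stable sort: ties keep insertion order
    elif sort == "alphabetical":
        items.sort(key=lambda kv: kv[0])
    lines = []
    for key, n in items[:cap]:  # output cap of 200 rows
        cols = []
        if show_count:
            cols.append(str(n))
        if show_percent:
            cols.append(f"{100 * n / total:.1f}%")
        cols.append(key)
        lines.append("\t".join(cols))
    if show_total:
        lines.append(f"Total: {total}")
    return "\n".join(lines)

sample = "the quick brown fox jumps over the lazy dog the fox is quick"
print(format_table(Counter(sample.split()), show_total=True))
```

With the sample text, the first rows come out as 3, 2, 2 for the, quick, and fox, followed by the single-count words and a final Total: 13 line.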

Worked example

The output is tab-separated and sorted by count, descending: the appears 3 times, and quick and fox appear twice each. Switch Group Size to 2 and the table becomes bigrams: the quick, quick brown, etc.

Input
the quick brown fox jumps over the lazy dog. the fox is quick.
Output
3	the
2	quick
2	fox
1	brown
1	jumps
1	over
1	lazy
1	dog
1	is

Settings reference

Option: Effect on output
Group Size (default 1): n-gram length. 1 = words, 2 = bigrams, up to 6.
Stop at . (default off): On: n-grams cannot span sentence boundaries (., !, ?).
Strip Punct (default on): Replace ASCII punctuation with spaces before tokenizing.
Show Count (default on): Prepend a count column to each row.
Show % (default off): Prepend a percent-of-total column.
Show Total (default off): Append a final Total: N line.
Sort (default By Count): Row order: By Count, Alphabetical, or Insertion Order.
Ignore Case (default on): On: The and the are one key. Off: separate keys.
Output cap: Top 200 rows shown.

FAQ

How are words tokenized?
On \b[\w']+\b (word chars with optional internal apostrophes). Hyphens are word boundaries here, so well-known splits into well and known.
How do I get bigrams or trigrams?
Set Group Size to 2 for bigrams, 3 for trigrams, up to 6. Each consecutive run of N words becomes one key, joined by a single space.
Why are The and the grouped together?
Because Ignore Case is on by default. Turn it off in the action bar to keep capitalised variants as separate keys.
Can I get percentages?
Yes. Turn on Show % in the action bar. Each row gets a percent-of-total column. Combine with Show Total to add a final Total: N line.
Why is the output capped at 200 rows?
For panel readability. If you need more, copy the output, paste it back into the input, and apply your own filtering, or use a CSV export workflow with find unique words first.