Word counting, with the rules made explicit
Word counting starts with a tokenizer. This tool builds a regex from your settings, runs String.match, and reports the length of the resulting array. With Contractions on, the pattern is [\w']+; off, it is \w+. With Hyphenated on, the pattern is wrapped in a hyphen group so well-known stays one token; off, the hyphen breaks the match. The default has both on, which matches how Word and Google Docs count.
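As a rough sketch of that pipeline, the pattern can be assembled from the two toggles and handed to String.match. The function and option names here (buildPattern, countWords, contractions, hyphenated) are hypothetical, chosen for illustration; the tool's actual source is not shown on this page.

```javascript
// Hypothetical helper: builds the token pattern from the two toggles.
function buildPattern({ contractions = true, hyphenated = true } = {}) {
  // One "piece" of a word: \w characters, plus the apostrophe when Contractions is on.
  const piece = contractions ? "[\\w']+" : "\\w+";
  // With Hyphenated on, allow further pieces joined by single hyphens.
  const source = hyphenated ? `${piece}(?:-${piece})*` : piece;
  return new RegExp(source, "g");
}

// Hypothetical helper: counts the matches of the pattern in the text.
function countWords(text, options) {
  // String.match with a global regex returns null when nothing matches.
  const tokens = text.match(buildPattern(options)) ?? [];
  return tokens.length;
}

console.log(buildPattern().source); // [\w']+(?:-[\w']+)*
console.log(countWords("Don't skip the well-known rule.")); // 5
```

The `?? []` fallback matters for empty input: without it, reading `.length` on a null match result would throw.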
The Ignore field accepts a comma- or space-separated stop-word list. Tokens that match (case-folded unless Match Case is on) are dropped before counting. Use it to skip stop-words like a, the, and when you want a content-word total. Match Case also affects the Unique Words total: with the toggle off, The and the collapse to one entry.
For ranked frequency tables, jump to word frequency. For the unique-word list itself, see unique words. For lexical diversity (unique / total ratio), see vocabulary size.
How to use Count Words in Text
1. Paste or type your text into the input panel on the left.
2. The word count appears in the output panel as you type.
3. Type a comma- or space-separated list into Ignore to skip stop-words.
4. Toggle Contractions or Hyphenated off if you want apostrophes or hyphens to break words.
5. Switch Mode to Unique Words to count distinct tokens instead of total occurrences.
Keyboard shortcuts
Drive TextResult without touching the mouse.
| Shortcut | Action |
|---|---|
| Ctrl+F | Open the find & replace panel inside the input (Plus) |
| Ctrl+Z | Undo the last input change |
| Ctrl+Shift+Z | Redo |
| Ctrl+Shift+Enter | Toggle fullscreen focus on the editor (Plus) |
| Esc | Close find & replace, or exit fullscreen |
| Ctrl+K | Open the command palette to jump to any tool (Plus) |
| Ctrl+S | Save current workflow draft (Plus) |
| Ctrl+P | Run a saved workflow (Plus) |
What this tool actually does
Default tokenizer matches Word / Docs
The default regex is [\w']+(?:-[\w']+)*. It accepts ASCII letters, digits, underscore, internal apostrophes, and internal hyphens. can't is one word. state-of-the-art is one word. 2024 is one word. Punctuation between words is the boundary.
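To see where the boundaries fall, the default pattern can be run against a line that mixes the cases above with stray punctuation. This is a small sketch using the regex quoted in this section; tokenize is a hypothetical name.

```javascript
// The default pattern as quoted above, with the global flag so match() returns every token.
const DEFAULT_PATTERN = /[\w']+(?:-[\w']+)*/g;

// Hypothetical helper: returns the token array (empty when nothing matches).
const tokenize = (text) => text.match(DEFAULT_PATTERN) ?? [];

// Tokens: can't, state-of-the-art, 2024, re, run, done
console.log(tokenize("can't, state-of-the-art (2024) re--run - done"));
```

A single hyphen joins two pieces, but a double hyphen or a free-standing hyphen does not. Note also that [\w']+ does not itself require the apostrophe to be internal, so a single-quoted word such as 'quoted' would match with its quotes attached while still counting as one token.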
Contractions toggle: don't as one or two
Default on. Off, the apostrophe is treated as a non-word character: don't splits into don and t, raising the count by one per contraction. Use the off setting when you need an acronym-strict count.
Hyphenated toggle: well-known as one or two
Default on. Off, the hyphen breaks the match: well-known splits into well and known. Off is closer to print-style hyphenated counts; on matches digital editing tools.
Ignore stop-words and Match Case
Type words separated by commas, spaces, or newlines into Ignore and they are dropped before counting. Match is case-folded by default. Turn on Match Case to require an exact case match (so The still counts when only the is ignored).
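One way to picture the ignore step is as a set lookup applied to the token array before counting. The sketch below assumes that case folding means lowercasing both the list and the tokens; parseIgnoreList and dropIgnored are hypothetical names, not the tool's own.

```javascript
// Hypothetical helper: splits the Ignore field on commas and any whitespace (spaces, newlines).
function parseIgnoreList(raw, matchCase = false) {
  const words = raw.split(/[\s,]+/).filter(Boolean);
  // Assumption: "case-folded" is modelled here as toLowerCase().
  return new Set(matchCase ? words : words.map((w) => w.toLowerCase()));
}

// Hypothetical helper: removes ignored tokens, folding case unless matchCase is set.
function dropIgnored(tokens, rawIgnore, matchCase = false) {
  const ignored = parseIgnoreList(rawIgnore, matchCase);
  return tokens.filter((t) => !ignored.has(matchCase ? t : t.toLowerCase()));
}

const tokens = ["The", "cat", "and", "the", "hat"];
console.log(dropIgnored(tokens, "a, the and"));       // [ 'cat', 'hat' ]
console.log(dropIgnored(tokens, "a, the and", true)); // [ 'The', 'cat', 'hat' ]
```

With Match Case on, the capitalized The survives because only the lowercase form is in the list.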
Mode: All Words versus Unique Words
Default All Words reports total occurrences. Unique Words reports the number of distinct tokens after the ignore-list is applied. With case folding (default), The and the count as one unique entry; with Match Case on they count as two.
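The difference between the two modes can be sketched with a Set, which keeps one copy of each distinct token. As above, lowercasing stands in for case folding and countUnique is a hypothetical name.

```javascript
// Hypothetical helper: counts distinct tokens, folding case unless matchCase is set.
function countUnique(tokens, matchCase = false) {
  const seen = new Set(tokens.map((t) => (matchCase ? t : t.toLowerCase())));
  return seen.size;
}

const sample = ["The", "the", "fox"];
console.log(sample.length);             // 3 (All Words)
console.log(countUnique(sample));       // 2 (Unique Words, case folded)
console.log(countUnique(sample, true)); // 3 (Unique Words, Match Case on)
```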
Worked example
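With defaults, Don't is one word and well-known is one word, so the total is 14. Switch Mode to Unique Words and the output becomes Unique words: 12, because The and both later occurrences of the collapse into a single entry under default case folding. With Match Case on, The stays separate and the result is 13.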
The quick brown fox jumps over the lazy dog. Don't forget the well-known shortcut.
Words: 14
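Running the same sentence through each toggle combination shows where the extra words come from when a toggle is switched off. This is a sketch only: the two mixed patterns are inferred from the description of the toggles rather than taken from the tool's source.

```javascript
const text =
  "The quick brown fox jumps over the lazy dog. Don't forget the well-known shortcut.";

// Patterns per setting; the mixed cases are assumptions based on the toggle descriptions.
const patterns = {
  "defaults": /[\w']+(?:-[\w']+)*/g,
  "Contractions off": /\w+(?:-\w+)*/g,
  "Hyphenated off": /[\w']+/g,
  "both off": /\w+/g,
};

// Count the matches for each setting.
const counts = Object.fromEntries(
  Object.entries(patterns).map(([label, re]) => [label, (text.match(re) ?? []).length])
);

console.log(counts);
// defaults: 14, Contractions off: 15, Hyphenated off: 15, both off: 16
```

Each toggle adds exactly one word here because the sentence contains one contraction and one hyphenated compound.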
Settings reference
| Option | Effect on output |
|---|---|
| Ignore | Comma or space separated stop-words. Tokens matching the list are dropped before counting. |
| Match Case (default off) | On: the ignore list and unique mode are case-sensitive. Off: case is folded. |
| Mode = All Words (default) | Reports total occurrences as Words: N. |
| Mode = Unique Words | Reports the count of distinct tokens as Unique words: N. |
| Contractions (default on) | Apostrophe is a word character: don't = 1 word. Off: 2 words. |
| Hyphenated (default on) | Hyphen is internal: well-known = 1 word. Off: 2 words. |
FAQ
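How is a word defined?
A word is any run of text that matches the default pattern [\w']+(?:-[\w']+)*: ASCII letters, digits, and underscores, with apostrophes and internal hyphens allowed. Punctuation and whitespace act as boundaries.
Does don't count as one word or two?
One by default. With Contractions off, the apostrophe becomes a boundary and don't splits into don and t.
How do I skip stop-words like a, the, and?
Type them into the Ignore field, separated by commas or spaces. Matching tokens are dropped before counting.
What is the difference between All Words and Unique Words?
All Words reports total occurrences. Unique Words reports the number of distinct tokens after the ignore list is applied.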
Does it work with non-ASCII letters?
\w in JavaScript matches ASCII [A-Za-z0-9_] only. Accented Latin letters and non-Latin scripts are not part of the default match, so a word such as café is cut at the accented letter. Paste already-transliterated text if you need pure-ASCII counting, or use text statistics, which uses an \S+ token (any non-whitespace run).
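The effect is easy to reproduce in any JavaScript console. The sketch below compares \w+ with the \S+ token mentioned above; the third pattern, built on Unicode property escapes, is shown only as a possible alternative for your own scripts and is not something this tool is stated to use.

```javascript
// "café naïve Zürich", written with escapes so the accented letters are precomposed.
const text = "caf\u00e9 na\u00efve Z\u00fcrich";

// \w is ASCII-only, so each accented letter ends the current token.
const ascii = text.match(/\w+/g) ?? [];
console.log(ascii.length); // 5 (caf, na, ve, Z, rich)

// \S+ takes any non-whitespace run, so each word stays whole.
const nonSpace = text.match(/\S+/g) ?? [];
console.log(nonSpace.length); // 3

// Assumption, not part of the tool: Unicode letters and digits via property escapes (needs the u flag).
const unicodeAware = text.match(/[\p{L}\p{N}_]+/gu) ?? [];
console.log(unicodeAware.length); // 3
```

Keep in mind that \S+ also keeps attached punctuation, so its tokens are not directly comparable to the \w-based ones.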