Calculate Shannon entropy of text

Paste any text and get its Shannon entropy in bits per character, plus the unique-character and total-character counts. Switch Scope to compute per paragraph or per line, and pick how many Decimals the figure should show. The transform runs in your browser; nothing uploads. For other text statistics, see the text statistics tool.


Shannon entropy, computed character by character

Shannon entropy is H = -sum(p_i * log2 p_i), where p_i is the empirical probability of the i-th character. The tool builds the character distribution by tallying every character in the input, normalises to probabilities, and applies the formula. The unit is bits per character: an entropy of 4 means each character carries on average 4 bits of information given the rest of the distribution.
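For reference, here is a minimal TypeScript sketch of the same computation. It is illustrative only, not the tool's actual source; the function name shannonEntropy is made up.

function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;                 // empty input reports 0
  const counts = new Map<string, number>();
  for (const unit of text.split("")) {             // split("") yields UTF-16 code units
    counts.set(unit, (counts.get(unit) ?? 0) + 1);
  }
  let h = 0;
  for (const count of counts.values()) {
    const p = count / text.length;                 // empirical probability p_i
    h -= p * Math.log2(p);                         // H = -sum(p_i * log2 p_i)
  }
  return h;
}

console.log(shannonEntropy("aaaaaa")); // 0: a single repeated character
console.log(shannonEntropy("abcd"));   // 2: four equally likely characters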

Typical values for English prose are 4.0 to 4.5 bits per character (with the lowercase alphabet plus space and common punctuation, the maximum is around 4.7 for uniform letters). Random ASCII (mixed case, digits, punctuation) approaches 6.5. Highly repetitive text (aaaaaa) approaches 0. Compressed binary or random bytes interpreted as text can exceed 7.5.

Scope picks the unit. Whole Text (default) reports one figure plus the unique- and total-character counts. Per Line reports Line N: H, one per input line. Per Paragraph reports Paragraph N: H, one per non-empty paragraph (paragraphs split on blank lines). Decimals controls how many digits the entropy is shown to (0 to 6, default 3).

How to use Calculate Shannon entropy of text

1. Paste or type your text into the input panel on the left.
2. The entropy figure appears in the output panel as you type.
3. Switch Scope to Per Line or Per Paragraph for finer-grained measurement.
4. Set Decimals to widen or narrow the precision.
5. Click Copy in the output header to copy the result.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut          Action
Ctrl+F            Open the find & replace panel inside the input (Plus)
Ctrl+Z            Undo the last input change
Ctrl+Shift+Z      Redo
Ctrl+Shift+Enter  Toggle fullscreen focus on the editor (Plus)
Esc               Close find & replace, or exit fullscreen
Ctrl+K            Open the command palette to jump to any tool (Plus)
Ctrl+S            Save current workflow draft (Plus)
Ctrl+P            Run a saved workflow (Plus)

What this tool actually does

Shannon entropy formula

H = -sum(p * log2 p), summed over every distinct character in the input. The probability p for each character is its count divided by the total length. Empty input reports 0.

Character-level, code-unit-by-code-unit

The tally indexes by JavaScript string position (UTF-16 code units), so an emoji surrogate pair contributes two distinct characters to the distribution. For BMP-only text, such as plain Latin prose, every character is a single code unit, so the result is ordinary per-character entropy.
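A quick illustration of the code-unit behaviour in TypeScript (any non-BMP character behaves the same way):

const emoji = "😀";            // U+1F600, outside the BMP
console.log(emoji.length);     // 2: two UTF-16 code units
console.log(emoji.split("")); // ["\ud83d", "\ude00"]: two tally entries
console.log([...emoji].length); // 1: a single code point, by contrast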

Scope = Whole, Per Line, or Per Paragraph

Default Whole Text emits three lines: entropy, unique-characters, total-characters. Per Line emits Line N: H per input line. Per Paragraph splits on \n\n+ and emits Paragraph N: H per non-empty paragraph.
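A sketch of the two finer-grained splits, assuming the \n\n+ rule quoted above (the helper names are illustrative):

const perLine = (text: string): string[] => text.split("\n");
const perParagraph = (text: string): string[] =>
  text.split(/\n\n+/).filter((p) => p.trim().length > 0); // non-empty paragraphs only

const sample = "first paragraph\nstill first\n\nsecond paragraph";
console.log(perLine(sample).length);      // 4: includes the blank line
console.log(perParagraph(sample).length); // 2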

Decimals sets precision

Default 3 (e.g. 4.123). Range 0 to 6. Set higher to compare similar inputs; set lower for a clean integer.
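In TypeScript terms this is plain decimal rounding of the figure (an assumption about the presentation, not the tool's source):

const h = 4.385453;
console.log(h.toFixed(3)); // "4.385" (the default)
console.log(h.toFixed(0)); // "4"
console.log(h.toFixed(6)); // "4.385453"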

Reads as a relative measure

Entropy is most useful for comparison: encrypted vs. plain, English vs. random, source A vs. source B. Absolute values depend on the alphabet and the unit, so do not over-interpret a single number in isolation.

Worked example

Forty-three characters across 27 distinct ones (the pangram has every letter of the alphabet plus space, but some letters repeat). Switch Scope to Per Line and the output collapses to a single Line 1: 4.385 entry.

Input
the quick brown fox jumps over the lazy dog
Output
Shannon entropy: 4.385 bits per character
Unique characters: 27
Total characters: 43

Settings reference

Option                        Effect on output
Scope = Whole Text (default)  Three lines: entropy, unique-chars, total-chars.
Scope = Per Line              One Line N: H per input line.
Scope = Per Paragraph         One Paragraph N: H per non-empty paragraph.
Decimals (default 3)          Digits after the decimal point, 0 to 6.
Empty input                   Reports Entropy: 0 bits/char.
Unit                          Bits per character (log base 2).

FAQ

What is a typical entropy for English text?
Around 4.0 to 4.5 bits per character at the character level (mixed case, spaces, common punctuation). Random ASCII goes higher; pure repetitive text goes lower.
Does the tool count emoji as one character or two?
Two, because JavaScript strings are UTF-16 and most emoji are surrogate pairs. The tally is over code units, not grapheme clusters.
How do I score each paragraph separately?
Switch Scope to Per Paragraph. Paragraphs are split on blank lines (\n\n+) and each gets its own line of output.
What does an entropy of 0 mean?
Every character is the same, or the input is empty. Entropy rises with diversity in the character distribution and is maximised when every character appears equally often: with k equally likely characters, H = log2 k.
Is the input sent anywhere?
No. The computation runs in your browser. Nothing uploads, nothing is logged.