Shannon entropy, computed character by character
Shannon entropy is H = -sum(p_i * log2 p_i), where p_i is the empirical probability of the i-th character. The tool builds the character distribution by tallying every character in the input, normalises to probabilities, and applies the formula. The unit is bits per character: an entropy of 4 means each character carries 4 bits of information on average under the empirical distribution.
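A minimal JavaScript sketch of this computation (the function name `shannonEntropy` is illustrative, not the tool's internal API):

```javascript
// Sketch of per-character Shannon entropy, assuming the tallying
// behaviour described above; not the tool's actual source.
function shannonEntropy(text) {
  if (text.length === 0) return 0;            // empty input reports 0
  const counts = new Map();
  for (let i = 0; i < text.length; i++) {     // tally UTF-16 code units
    const ch = text[i];
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  let h = 0;
  for (const n of counts.values()) {
    const p = n / text.length;                // empirical probability p_i
    h -= p * Math.log2(p);                    // H = -sum(p_i * log2 p_i)
  }
  return h;
}

console.log(shannonEntropy("aabb")); // 1 — two equiprobable characters
console.log(shannonEntropy("aaaa")); // 0 — a single repeated character
```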
Typical values for English prose are 4.0 to 4.5 bits per character (with the lowercase alphabet plus space and common punctuation, the maximum is around 4.7 for uniform letters). Random ASCII (mixed case, digits, punctuation) approaches 6.5. Highly repetitive text (aaaaaa) approaches 0. Compressed binary or random bytes interpreted as text can exceed 7.5.
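These ceilings follow from the uniform case: a uniform distribution over k distinct symbols has entropy log2(k), which bounds the figures quoted above. A quick check:

```javascript
// Uniform distribution over k symbols has entropy log2(k).
console.log(Math.log2(26).toFixed(3));  // "4.700" — lowercase letters only
console.log(Math.log2(94).toFixed(3));  // "6.555" — printable ASCII
console.log(Math.log2(256).toFixed(3)); // "8.000" — arbitrary byte values
```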
Scope picks the unit. Whole Text (default) reports one figure plus the unique- and total-character counts. Per Line reports Line N: H, one per input line. Per Paragraph reports Paragraph N: H, one per non-empty paragraph (paragraphs split on blank lines). Decimals controls how many digits the entropy is shown to (0 to 6, default 3).
How to calculate the Shannon entropy of text
1. Paste or type your text into the input panel on the left.
2. The entropy figure appears in the output panel as you type.
3. Switch Scope to Per Line or Per Paragraph for finer-grained measurement.
4. Set Decimals to widen or narrow the precision.
5. Click Copy in the output header to copy the result.
Keyboard shortcuts
Drive TextResult without touching the mouse.
| Shortcut | Action |
|---|---|
| Ctrl F | Open the find & replace panel inside the input (Plus) |
| Ctrl Z | Undo the last input change |
| Ctrl Shift Z | Redo |
| Ctrl Shift Enter | Toggle fullscreen focus on the editor (Plus) |
| Esc | Close find & replace, or exit fullscreen |
| Ctrl K | Open the command palette to jump to any tool (Plus) |
| Ctrl S | Save current workflow draft (Plus) |
| Ctrl P | Run a saved workflow (Plus) |
What this tool actually does
Shannon entropy formula
H = -sum(p * log2 p) over every character in the input. The probability p for each character is its frequency divided by the total length. Empty input reports 0.
Character-level, code-unit-by-code-unit
The tally indexes by JavaScript string position (UTF-16 code units), so an emoji outside the Basic Multilingual Plane contributes its two surrogate code units to the distribution rather than one character. For BMP-only Latin text, where every character is a single code unit, this is identical to per-character entropy.
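A quick check of the code-unit behaviour, using a standard astral-plane emoji:

```javascript
// One code point outside the BMP is two UTF-16 code units, so the
// tally sees two separate entries for it.
const emoji = "\u{1F600}";          // grinning face
console.log(emoji.length);          // 2 — code units
console.log([...emoji].length);     // 1 — code point
console.log(emoji[0] === "\uD83D"); // true — high surrogate
console.log(emoji[1] === "\uDE00"); // true — low surrogate
```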
Scope = Whole, Per Line, or Per Paragraph
Default Whole Text emits three lines: entropy, unique-characters, total-characters. Per Line emits Line N: H per input line. Per Paragraph splits on \n\n+ and emits Paragraph N: H per non-empty paragraph.
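The splitting rules can be sketched as follows (function names are illustrative, not the tool's internals):

```javascript
// Per Line: one entry per input line. Per Paragraph: split on runs of
// blank lines (\n\n+) and drop empty paragraphs, per the docs above.
function splitLines(text) {
  return text.split("\n");
}
function splitParagraphs(text) {
  return text.split(/\n{2,}/).filter(p => p.trim().length > 0);
}

const sample = "alpha\nbeta\n\ngamma\n\n\ndelta";
console.log(splitLines(sample).length);      // 7 lines (blank lines count)
console.log(splitParagraphs(sample).length); // 3 non-empty paragraphs
```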
Decimals sets precision
Default 3 (e.g. 4.123). Range 0 to 6. Set higher to compare similar inputs; set lower for a clean integer.
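Decimals affects display only, e.g. via `toFixed`-style rounding; the underlying value is unchanged:

```javascript
// Decimals (0–6) controls display precision, not the computation.
const h = 4.1234567;
console.log(h.toFixed(3)); // "4.123" — the default
console.log(h.toFixed(0)); // "4" — clean integer
console.log(h.toFixed(6)); // "4.123457" — maximum precision
```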
Reads as a relative measure
Entropy is most useful for comparison: encrypted vs. plain, English vs. random, source A vs. source B. Absolute values depend on the alphabet and the unit, so do not over-interpret a single number in isolation.
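To make the relative reading concrete, here is an illustrative comparison (the strings are arbitrary samples, and the helper is repeated so the snippet stands alone):

```javascript
// Self-contained entropy helper for the comparison below.
function H(s) {
  const counts = new Map();
  for (let i = 0; i < s.length; i++) {
    counts.set(s[i], (counts.get(s[i]) || 0) + 1);
  }
  let h = 0;
  for (const n of counts.values()) {
    const p = n / s.length;
    h -= p * Math.log2(p);
  }
  return h;
}

console.log(H("aaaaaaaaaaaaaaaaaaaa").toFixed(3)); // repetitive: 0.000
console.log(H("the quick brown fox").toFixed(3));  // prose-like, mid-range
console.log(H("q9$Kz@1mW#pT4&vL7!cE").toFixed(3)); // random-looking: highest
```

Comparing the three figures against each other is more informative than reading any one of them in isolation.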
Worked example
the quick brown fox jumps over the lazy dog

Shannon entropy: 4.385 bits per character
Unique characters: 27
Total characters: 43

Forty-three characters across 27 distinct ones (the pangram uses every letter of the alphabet plus the space, and several letters repeat). Switch Scope to Per Line and the output collapses to a single Line 1: 4.385 entry.
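The worked example can be recomputed directly (the helper is a sketch, not the tool's code):

```javascript
// Recompute the pangram example by hand.
function entropy(text) {
  const counts = new Map();
  for (let i = 0; i < text.length; i++) {
    counts.set(text[i], (counts.get(text[i]) || 0) + 1);
  }
  let h = 0;
  for (const n of counts.values()) {
    const p = n / text.length;
    h -= p * Math.log2(p);
  }
  return h;
}

const s = "the quick brown fox jumps over the lazy dog";
console.log(entropy(s).toFixed(3)); // "4.385"
console.log(new Set(s).size);       // 27 unique characters
console.log(s.length);              // 43 total characters
```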
Settings reference
| Option | Effect on output |
|---|---|
| Scope = Whole Text (default) | Three lines: entropy, unique-chars, total-chars. |
| Scope = Per Line | One Line N: H per input line. |
| Scope = Per Paragraph | One Paragraph N: H per non-empty paragraph. |
| Decimals (default 3) | Digits after the decimal point. 0 to 6. |
| Empty input | Reports Entropy: 0 bits/char. |
| Unit | Bits per character (log base 2). |
FAQ
What is a typical entropy for English text?
Around 4.0 to 4.5 bits per character for ordinary prose. Repetitive text sits far lower; random printable ASCII sits higher, approaching 6.5.
Does the tool count emoji as one character or two?
Two, for emoji outside the Basic Multilingual Plane: the tally works on UTF-16 code units, so a surrogate pair contributes two entries to the distribution.
How do I score each paragraph separately?
Per Paragraph. Paragraphs are split on blank lines (\n\n+) and each gets its own line of output.