Remove zero-width characters from text

Remove zero-width characters from text by stripping the invisible Unicode codepoints that take up no display space but still count as characters: zero-width space (U+200B), zero-width non-joiner (U+200C), zero-width joiner (U+200D), word joiner (U+2060), and byte order mark (U+FEFF). These often hide in copy-pasted text, AI output, and exports from word processors, where they break string comparisons and search. The transform runs in your browser. For non-printing ASCII control codes, see remove control chars.


How zero-width removal works

The pass is a single regex: /[\u200B-\u200D\u2060\uFEFF]/g (written here with escapes so the invisible characters are readable). Each match is replaced with an empty string. The set covers the most common zero-width characters: ZWSP (U+200B, used for invisible word breaks), ZWNJ (U+200C, prevents ligatures in scripts like Persian), ZWJ (U+200D, joins emoji components and Indic letters), the word joiner (U+2060, prevents line breaking), and the BOM (U+FEFF, often left at the start of files).
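As a minimal sketch of the pass described above, the same pattern can be written with Unicode escapes and wrapped in a function. The names ZERO_WIDTH and stripZeroWidth are hypothetical, chosen for illustration, and are not taken from the tool's source.

```javascript
// Sketch of the single-regex pass; the names are hypothetical.
// Escapes keep the invisible characters readable in source code.
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;

function stripZeroWidth(text) {
  // Each match is replaced with the empty string.
  return text.replace(ZERO_WIDTH, "");
}

console.log(stripZeroWidth("Hello\u200Bworld")); // Helloworld
console.log(stripZeroWidth("\uFEFFname\u2060value").length); // 9
```

The character class uses one range (U+200B to U+200D) plus two single codepoints, so all five targets are covered by one scan of the input.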

These characters take up no width on screen but still occupy space in the string: a string with a hidden ZWSP is one character longer than it looks, and an equality check fails between the clean version and the visually identical version that contains the hidden character. They often appear in AI-generated copy and in text copied from word processors or pasted from rich-text editors that use them for layout tricks or invisible watermarking.
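The length and equality mismatch can be seen directly in a short sketch. The variable names are illustrative, and the byte count assumes UTF-8 encoding, where U+200B takes three bytes.

```javascript
const visible = "Helloworld";
const hidden = "Hello\u200Bworld"; // renders the same, but contains a ZWSP

const lengths = [visible.length, hidden.length]; // [10, 11]
const equal = visible === hidden; // false

// In UTF-8 the ZWSP alone takes three bytes (E2 80 8B).
const utf8Bytes = new TextEncoder().encode(hidden).length; // 13

console.log(lengths, equal, utf8Bytes);
```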

After the pass, the underlying string is exactly what it appears to be. Visible text is unchanged except where a removed joiner was shaping it, as in compound emoji or scripts that rely on ZWNJ and ZWJ. Pair with remove control chars for ASCII control bytes, or with remove non-ASCII if you want to flatten to plain 7-bit text. Output is computed in your browser on every keystroke, with no upload.

How to use remove zero-width characters from text

  1. Paste text copied from a webpage, document, or AI chat into the input.
  2. Read the cleaned result on the right with invisible characters stripped.
  3. Compare the input and output lengths to confirm hidden characters were removed.
  4. Click Copy to take the cleaned text.
  5. Run remove control chars afterwards if ASCII control bytes also lurk.
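The length comparison in step 3 can also be done outside the tool. The sketch below tallies which of the five targeted codepoints occur in a string. countZeroWidth is a hypothetical helper, not part of the tool.

```javascript
// Hypothetical helper: tally targeted zero-width codepoints in a string.
function countZeroWidth(text) {
  const counts = {};
  for (const match of text.matchAll(/[\u200B-\u200D\u2060\uFEFF]/g)) {
    const hex = match[0].codePointAt(0).toString(16).toUpperCase();
    const label = "U+" + hex.padStart(4, "0");
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}

const input = "\uFEFFHello\u200Bworld\u200B";
const found = countZeroWidth(input);
console.log(found); // one U+FEFF and two U+200B
```

The sum of the counts equals the difference between the input and output lengths, here 3.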

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut | Action
Ctrl+F | Open the find & replace panel inside the input (Plus)
Ctrl+Z | Undo the last input change
Ctrl+Shift+Z | Redo
Ctrl+Shift+Enter | Toggle fullscreen focus on the editor (Plus)
Esc | Close find & replace, or exit fullscreen
Ctrl+K | Open the command palette to jump to any tool (Plus)
Ctrl+S | Save current workflow draft (Plus)
Ctrl+P | Run a saved workflow (Plus)

What this tool actually does

Strips the standard zero-width set

Targets U+200B (ZWSP), U+200C (ZWNJ), U+200D (ZWJ), U+FEFF (BOM), and U+2060 (word joiner). These are the codepoints commonly used for invisible breaks, ligature control, emoji joining, and document markers.

Visible characters pass through unchanged

Letters, digits, punctuation, whitespace, and standalone emoji in your input pass through with no modification. Only the invisible characters in the targeted set are dropped, so the on-screen text is identical before and after unless a removed joiner was shaping it, as in compound emoji or scripts that rely on ZWNJ and ZWJ.

Removes BOM at the start of files

The byte order mark (U+FEFF) often appears as the first character of UTF-8 files exported from Windows tools. JSON parsers and shell scripts choke on it. This pass strips every occurrence, including a BOM at the very beginning.
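To illustrate why a leading BOM is a problem, the sketch below shows JSON.parse rejecting a string that starts with U+FEFF. It then removes only the leading BOM, which is a narrower alternative to the tool's pass (the tool strips every occurrence). The sample JSON and variable names are made up for illustration.

```javascript
// File contents with a leading BOM, as some Windows tools export them.
const raw = '\uFEFF{"name": "value"}';

let parseFailed = false;
try {
  JSON.parse(raw); // U+FEFF is not valid JSON whitespace
} catch (err) {
  parseFailed = true;
}

// Narrower alternative to the tool's pass: drop only a leading BOM.
const withoutLeadingBom = raw.replace(/^\uFEFF/, "");
const parsed = JSON.parse(withoutLeadingBom);

console.log(parseFailed, parsed.name);
```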

Affects emoji that rely on ZWJ

Compound emoji like family sequences are built using zero-width joiners. Stripping them splits the emoji into its component parts. To strip emoji wholesale instead, use remove emoji, which targets the full pictograph ranges directly.
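The split can be reproduced with a family sequence written as escapes. This is a sketch using the same five-codepoint pattern. The variable names are illustrative.

```javascript
// Man + ZWJ + woman + ZWJ + girl, which renders as one family glyph where supported.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
const stripped = family.replace(/[\u200B-\u200D\u2060\uFEFF]/g, "");

console.log(Array.from(family).length);   // 5 code points
console.log(Array.from(stripped).length); // 3 code points
console.log(stripped === "\u{1F468}\u{1F469}\u{1F467}"); // true
```

After the pass the three people remain as three separate emoji, because the joiners that bound them into one glyph are gone.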

Single regex pass, no upload

The transform is one String.replace call evaluated in your browser on every keystroke. It runs in linear time, even on huge inputs. Nothing about the text leaves the page, no log is kept on our servers, and the output panel updates without a network round trip.

Worked example

The hidden ZWSP between Hello and world disappears, and the BOM between name and value is stripped. Visible characters are untouched.

Input
Hello​world
namevalue
Output
Helloworld
namevalue
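Because the input and output above look identical on screen, it can help to make the hidden characters visible before stripping them. The sketch below replaces each targeted codepoint with a readable tag. revealZeroWidth is a hypothetical helper, not a feature of the tool.

```javascript
// Hypothetical helper: show targeted zero-width characters as <U+XXXX> tags.
function revealZeroWidth(text) {
  return text.replace(/[\u200B-\u200D\u2060\uFEFF]/g, (ch) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase();
    return "<U+" + hex.padStart(4, "0") + ">";
  });
}

console.log(revealZeroWidth("Hello\u200Bworld")); // Hello<U+200B>world
console.log(revealZeroWidth("name\uFEFFvalue"));  // name<U+FEFF>value
```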

Settings reference

Behaviour | Effect on output
Zero-width space (U+200B) | Stripped. Often used for invisible word breaks in HTML.
Zero-width non-joiner (U+200C) | Stripped. Used in Persian and Arabic to prevent ligatures.
Zero-width joiner (U+200D) | Stripped. Used to combine emoji components and Indic letters.
Byte order mark (U+FEFF) | Stripped. Common stray character at the start of UTF-8 files.
Word joiner (U+2060) | Stripped. Prevents line breaks between adjacent characters.
Visible characters and emoji bodies | Pass through unchanged. Only the targeted invisible codepoints are removed.
Other invisible Unicode | Pass through. This pass targets only the listed five codepoints.

FAQ

Why does my text look identical before and after?
Zero-width characters take up no visible space, so the rendered text looks the same. Compare the character counts in the input and output panels: a difference means hidden characters were removed. The cleaned string now compares equal to the text as it appears, which matters for search, comparison, and exports.
Will it break compound emoji?
Yes. Compound emoji like the family sequences are built using zero-width joiners (U+200D). Stripping them splits the compound emoji into its component parts. If your goal is to remove emoji wholesale, use remove emoji, which targets the pictograph ranges directly.
What about the BOM at the start of a file?
Removed. The byte order mark (U+FEFF) often appears as the first character in UTF-8 files exported from Windows tools, where it confuses JSON parsers and shell scripts. This pass strips every BOM in the text, including the one at the very start.
Are there other invisible Unicode characters this misses?
Yes. The pass targets only the five most common offenders. Less common invisible characters (variation selectors, soft hyphens, the Mongolian vowel separator) pass through. Use find and replace with explicit codepoints if you need to target them, or remove non-ASCII for a wholesale flatten.
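If you script the cleanup yourself, a wider character class is one way to catch the extras named above. The set below is an assumption chosen for illustration, not the tool's pattern. It adds the soft hyphen (U+00AD), the Mongolian vowel separator (U+180E), and the basic variation selectors (U+FE00 to U+FE0F), and it does not cover the supplementary variation selectors.

```javascript
// Assumption: an extended set for illustration, not the tool's own pattern.
const EXTENDED_INVISIBLE = /[\u00AD\u180E\u200B-\u200D\u2060\uFE00-\uFE0F\uFEFF]/g;

function stripExtended(text) {
  return text.replace(EXTENDED_INVISIBLE, "");
}

console.log(stripExtended("co\u00ADoperate")); // cooperate
console.log(stripExtended("\u2764\uFE0F") === "\u2764"); // true
```

Removing variation selectors can change how a character is displayed (for example, emoji versus text presentation of U+2764), so a wider set like this is not guaranteed to leave visible output unchanged.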
Is the input uploaded?
No. The pass is a single regex replacement evaluated in your browser. Nothing about the text leaves the page or is logged on our servers, and the output panel updates with no network round trip.