Remove non-ASCII characters from text

Remove non-ASCII characters from text by stripping every codepoint outside the basic ASCII range U+0000 to U+007F. Accented letters (é, ñ), smart quotes (“, ”), em dashes (—), and emoji all disappear, leaving only plain 7-bit ASCII. Use this when feeding text into systems that cannot handle Unicode. The transform runs in your browser. For just diacritic stripping, see remove accents.

How non-ASCII removal works

The pass is a single regex replacement: /[^\x00-\x7F]/g matches every character outside the range U+0000 to U+007F and replaces it with an empty string. ASCII covers the standard English alphabet, digits, common punctuation, the space, and control characters such as tab and line feed. Anything else (accented letters, currency symbols other than $, smart quotes, en and em dashes, emoji) is dropped.
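As a minimal sketch of the pass described above, the snippet below wraps the quoted regex in a function. The name removeNonAscii is a hypothetical helper chosen for illustration, not something the tool exposes.

```javascript
// Hypothetical helper name; the pattern is the one quoted in the text.
function removeNonAscii(text) {
  // No `u` flag: the class is tested against UTF-16 code units, so both
  // halves of a surrogate pair (used for emoji) fall outside \x00-\x7F
  // and are removed together.
  return text.replace(/[^\x00-\x7F]/g, "");
}

console.log(removeNonAscii("café"));                   // "caf"
console.log(removeNonAscii("price: 5€"));              // "price: 5"
console.log(removeNonAscii("tab\tand\nnewline stay")); // unchanged
```

Tab and line feed are ASCII control characters, so they survive the strip.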

This is a hard cut: there is no transliteration step. café becomes caf, not cafe; the é is removed entirely along with its accent. To preserve the base letter, run remove accents first, which decomposes é into e + combining mark, then strips the mark. The chain produces cafe.
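The two-step chain can be sketched as follows. This assumes the accent step behaves like Unicode NFD normalization followed by removal of the combining marks in U+0300 to U+036F; the actual remove accents tool may cover more ranges. Both function names are hypothetical.

```javascript
// Assumed stand-in for the "remove accents" step: decompose, then drop
// combining diacritical marks (U+0300 to U+036F).
function stripAccents(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// The hard cut described in the text.
function removeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, "");
}

const input = "café résumé naïve façade";
console.log(removeNonAscii(input));               // "caf rsum nave faade"
console.log(removeNonAscii(stripAccents(input))); // "cafe resume naive facade"
```

Letters with no canonical decomposition, such as ß or ø, are unaffected by the accent step and are still deleted by the final strip.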

Output is computed in your browser on every keystroke as one regex call. The pass is fast even on huge inputs because the regex engine walks the string once, copying through ASCII characters and skipping the rest. Common follow-ups are remove em dashes (run before this to convert em dashes to hyphens rather than deleting them) and remove extra spaces (to clean up gaps left by removed characters).
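The ordering of the follow-up steps matters, and a small sketch makes the difference visible. The function names are hypothetical, and the dash and space handling here are assumptions about what remove em dashes and remove extra spaces do, reduced to their simplest form.

```javascript
// Strip non-ASCII, then collapse runs of spaces left behind by deletions.
function stripThenTidy(text) {
  return text
    .replace(/[^\x00-\x7F]/g, "")
    .replace(/ {2,}/g, " ");
}

// Convert em dashes (U+2014) to hyphens first, so the strip has nothing
// to delete at that position.
function dashThenStrip(text) {
  return stripThenTidy(text.replace(/\u2014/g, "-"));
}

console.log(stripThenTidy("wait \u2014 what")); // "wait what"
console.log(dashThenStrip("wait \u2014 what")); // "wait - what"
```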

How to use Remove non-ASCII characters from text

  1. Paste text containing non-ASCII characters into the input panel.
  2. Read the result on the right with everything outside ASCII stripped.
  3. For accented letters, run remove accents first to keep the base letter.
  4. Click Copy to take the ASCII-only text.
  5. Pair with remove extra spaces if gaps need tidying.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut | Action
Ctrl+F | Open the find & replace panel inside the input (Plus)
Ctrl+Z | Undo the last input change
Ctrl+Shift+Z | Redo
Ctrl+Shift+Enter | Toggle fullscreen focus on the editor (Plus)
Esc | Close find & replace, or exit fullscreen
Ctrl+K | Open the command palette to jump to any tool (Plus)
Ctrl+S | Save current workflow draft (Plus)
Ctrl+P | Run a saved workflow (Plus)

What this tool actually does

Hard cut at codepoint U+007F

Every character with codepoint above 127 is removed. That covers Latin-1 supplement (Latin accented letters), Latin Extended, Greek, Cyrillic, CJK ideographs, Arabic, Hebrew, emoji, and every modern Unicode block. Only the original 7-bit ASCII range survives.
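To see which codepoints a given input would lose, a small audit helper can list them before stripping. This is an illustrative sketch only; droppedCodePoints is a hypothetical name and the tool itself does not report this.

```javascript
// List every codepoint above U+007F in the input, formatted as U+XXXX.
function droppedCodePoints(text) {
  const dropped = [];
  // for...of iterates by codepoint, so astral characters such as emoji
  // are reported once rather than as two surrogate halves.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp > 0x7f) {
      dropped.push("U+" + cp.toString(16).toUpperCase().padStart(4, "0"));
    }
  }
  return dropped;
}

// ñ (Latin-1 Supplement), Ω (Greek), 漢 (CJK), thumbs-up emoji
console.log(droppedCodePoints("a\u00F1 \u03A9 \u6F22 \u{1F44D}"));
// ["U+00F1", "U+03A9", "U+6F22", "U+1F44D"]
```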

Smart quotes and dashes go

Curly quotes (“, ”, ‘, ’), em dashes (—), en dashes (–), ellipsis (…), and other typographic punctuation are all outside ASCII and get stripped. Run remove em dashes first if you want them converted to hyphens rather than deleted.
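If the typographic punctuation should survive as ASCII equivalents, a mapping pass can run before the strip. The table and function names below are hypothetical, and the chosen replacements (straight quotes, hyphen, three dots) are one reasonable convention rather than what any particular tool on the site does.

```javascript
// Assumed ASCII stand-ins for common typographic punctuation.
const ASCII_PUNCTUATION = {
  "\u201C": '"',   // left double quote
  "\u201D": '"',   // right double quote
  "\u2018": "'",   // left single quote
  "\u2019": "'",   // right single quote
  "\u2013": "-",   // en dash
  "\u2014": "-",   // em dash
  "\u2026": "...", // ellipsis
};

function asciiPunctuation(text) {
  return text.replace(
    /[\u201C\u201D\u2018\u2019\u2013\u2014\u2026]/g,
    (ch) => ASCII_PUNCTUATION[ch]
  );
}

function removeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, "");
}

const input = "She said \u201Chello\u201D \u2014 but he said \u2018hi\u2019\u2026";
console.log(removeNonAscii(asciiPunctuation(input)));
// She said "hello" - but he said 'hi'...
```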

Emoji and pictographs disappear

Every emoji codepoint sits well above U+007F (mostly U+1F300+), so they are removed in this pass. Variation selectors and zero-width joiners often used in emoji sequences also go. Use remove emoji for a more targeted strip that leaves other Unicode in place.
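A short sketch shows how compound emoji behave under the same regex. The helper name is hypothetical. One edge case worth knowing: keycap emoji are built on an ASCII base character, so that base survives the strip.

```javascript
function removeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, "");
}

// Woman technologist: U+1F469, zero-width joiner U+200D, U+1F4BB.
// All three parts are non-ASCII, so the whole sequence disappears.
console.log(removeNonAscii("ok \u{1F469}\u200D\u{1F4BB}")); // "ok "

// Heart with variation selector U+FE0F: both codepoints are removed.
console.log(removeNonAscii("I \u2764\uFE0F JS")); // "I  JS"

// Keycap 1: ASCII "1" + U+FE0F + U+20E3. The digit itself is ASCII
// and stays; only the selector and the enclosing keycap mark go.
console.log(removeNonAscii("1\uFE0F\u20E3")); // "1"
```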

No transliteration, just deletion

Characters are dropped, not converted. é becomes nothing, not e. To keep the base letter, run remove accents first; that decomposes accented letters and strips just the combining marks, then this tool finishes off any non-ASCII remainder.

Single linear regex pass

Implemented as s.replace(/[^\x00-\x7F]/g, ''). The engine walks the input once, copying ASCII through and dropping the rest. Fast on huge inputs, and computed in your browser on every keystroke without a server round trip.

Worked example

Accented letters are deleted along with their accents. Smart quotes and the em dash also go. Run remove accents first to keep base letters.

Input
Café résumé naïve façade
She said “hello” — but he said ‘hi’.
Output
Caf rsum nave faade
She said hello  but he said hi.

Settings reference

Behaviour | Effect on output
ASCII letters, digits, common punctuation | Pass through unchanged.
Accented Latin letters (é, ñ, ü) | Removed entirely. Run remove accents first to keep the base letter.
Smart quotes and ellipsis | Removed. Use find and replace first to convert to ASCII equivalents.
Em dash and en dash | Removed. Use remove em dashes first to convert to hyphens.
Emoji and pictographs | Removed. Use remove emoji for a targeted strip.
CJK, Arabic, Hebrew, Cyrillic, Greek | Removed. The whole codepoint is dropped.
Whitespace and line endings | ASCII space, tab, LF, and CR pass through. Non-breaking space (U+00A0) is removed.
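Because the non-breaking space is deleted rather than converted, words joined only by an NBSP end up fused. The sketch below shows the effect and one possible pre-step that swaps NBSP for a regular space. That pre-step is an assumption for illustration, not a setting of this tool, and the function names are hypothetical.

```javascript
function removeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, "");
}

// Optional pre-step (assumed, not part of the tool): turn U+00A0 into
// an ordinary space before the strip so word boundaries survive.
function removeNonAsciiKeepingSpaces(text) {
  return removeNonAscii(text.replace(/\u00A0/g, " "));
}

console.log(removeNonAscii("10\u00A0km"));              // "10km"
console.log(removeNonAsciiKeepingSpaces("10\u00A0km")); // "10 km"
console.log(removeNonAscii("a\tb\r\nc") === "a\tb\r\nc"); // true
```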

FAQ

Will accented letters become their plain equivalents?
No. The pass deletes every non-ASCII codepoint outright, so é becomes nothing rather than e. To preserve the base letter, run remove accents first; that decomposes the accented letter and strips the combining mark, leaving the base. Then this tool removes anything still outside ASCII.
Does it remove smart quotes and em dashes?
Yes. Curly quotes (“, ”, ‘, ’) and em or en dashes (—, –) are all outside ASCII and get stripped. To convert them to ASCII equivalents instead of deleting them, use find and replace for the quotes and remove em dashes for the dashes before this pass.
What about emoji?
Removed. Emoji codepoints sit well above U+007F, so they fail the ASCII test and disappear. The strip also catches variation selectors and zero-width joiners often used in compound emoji. For a targeted emoji strip that leaves other Unicode alone, use remove emoji.
Is non-breaking space treated as ASCII?
No. The non-breaking space sits at U+00A0, outside the ASCII range, so this pass removes it. Regular ASCII spaces (U+0020) and tabs (U+0009) pass through. This matters if your input was copied from a webpage or word processor that uses NBSP for layout, because words separated only by an NBSP end up run together.
Is the input uploaded?
No. The pass is a single regex replacement evaluated in your browser on every keystroke. Nothing about the text you paste leaves the page or is logged on our servers.