Extract URLs from text

Extract URLs finds every http:// and https:// link in pasted text and lists them one per line. The match rule is https?:\/\/[^\s<>"']+: a literal scheme followed by a run of characters that are not whitespace, angle brackets or quotes. Query strings, hashes and ports come through intact. The transform runs in your browser; nothing is uploaded.


How URL matching works here

The pattern starts at http:// or https:// and grabs every following character that is not whitespace, not an angle-bracket and not a single or double quote. That stops the match at sensible boundaries: a space, a line break, the end of an HTML attribute, or the start of a quoted string. Query strings (?ref=docs), hashes (#install) and ports (:8443) are part of the match.
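
As a minimal sketch, the same rule can be reproduced outside the tool with Python's re module. Only the pattern comes from the description above; the helper name extract_urls and the sample text are assumptions, and the tool's own implementation may differ.

```python
import re

# Pattern quoted from the description; "/" needs no escaping in Python.
URL_PATTERN = re.compile(r"""https?://[^\s<>"']+""")

def extract_urls(text: str) -> list[str]:
    """Return every http:// or https:// match in source order (hypothetical helper)."""
    return URL_PATTERN.findall(text)

sample = 'See https://example.com:8443/path?q=1#top or <a href="https://example.org/a">a</a>'

# One URL per line, as the tool lists them.
print("\n".join(extract_urls(sample)))
```

The port, query string and hash stay inside the first match, and the second match ends at the closing double quote of the href attribute.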

Schemeless URLs (example.com/page) are not captured. The matcher requires http:// or https:// so you do not get false positives from filenames, version numbers or domain-only mentions. Other schemes like ftp://, mailto: and tel: are also skipped; for those, use extract regex matches with a custom pattern.

Output is one URL per line in the order they appear. Trailing punctuation that sits directly after a URL inside a sentence (a full stop, a closing bracket) is included in the match, because it is not whitespace, an angle bracket or a quote; in that case run find and replace on the result to trim the tail.
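
If you post-process the list in a script rather than with find and replace, the trim might look like the sketch below. The set of characters treated as trailing punctuation is an assumption, not something the tool defines, and extract_trimmed is a hypothetical name.

```python
import re

URL_PATTERN = re.compile(r"""https?://[^\s<>"']+""")

# Assumption: these characters count as sentence punctuation when they end a match.
TRAILING = ".,;:!?)]}"

def extract_trimmed(text: str) -> list[str]:
    """Extract URLs, then strip assumed sentence punctuation from the end of each."""
    return [match.rstrip(TRAILING) for match in URL_PATTERN.findall(text)]

text = "Visit https://example.org/page?ref=docs. (Mirror: https://example.net:8443/v2#install)."
print(extract_trimmed(text))
```

This is a blunt rule: a URL that legitimately ends in a closing bracket loses that bracket too, so check the result if your links contain them.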

How to use Extract URLs from text

  1. Paste text containing URLs into the input panel.
  2. The output panel shows every http or https URL, one per line.
  3. Click Copy to copy the list.
  4. Click Download to save it as a plain-text file.
  5. Pipe the result through remove duplicate lines if you want unique URLs only.

Keyboard shortcuts

Drive TextResult without touching the mouse.

Shortcut            Action
Ctrl+F              Open the find & replace panel inside the input (Plus)
Ctrl+Z              Undo the last input change
Ctrl+Shift+Z        Redo
Ctrl+Shift+Enter    Toggle fullscreen focus on the editor (Plus)
Esc                 Close find & replace, or exit fullscreen
Ctrl+K              Open the command palette to jump to any tool (Plus)
Ctrl+S              Save current workflow draft (Plus)
Ctrl+P              Run a saved workflow (Plus)

What counts as a URL here

Scheme is required

The match starts at http:// or https://. Schemeless mentions like example.com are not captured. This avoids false positives on file names and version strings.

Query strings, hashes and ports preserved

?utm_source=site, #section-2 and :8443 are part of the URL and are kept. So https://example.com:8443/path?q=1#top is one match, not three.

Stops at whitespace, angle-brackets and quotes

The character class [^\s<>"'] halts the match at a space, tab, newline, <, >, a single quote or a double quote. So URLs inside href="...", in plain prose, and inside angle brackets (<https://example.com>) all match cleanly.

Other schemes skipped

ftp://, file://, mailto:, tel: and protocol-relative //host/path are not matched. For those, use extract regex matches with a custom pattern.
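
A custom pattern for those schemes could look like the sketch below. It is an illustration only: the scheme list and the name OTHER_SCHEMES are assumptions, and the stop characters are borrowed from the http/https rule rather than taken from any documented setting.

```python
import re

# Hypothetical custom pattern: ftp:// and file:// need the double slash,
# mailto: and tel: do not. \b keeps "sftp://" from matching as "ftp://".
OTHER_SCHEMES = re.compile(r"""\b(?:(?:ftp|file)://|(?:mailto|tel):)[^\s<>"']+""")

text = (
    "Get ftp://files.example.com/pub/a.zip or write to mailto:team@example.com "
    "or call tel:+15550100 (not https://example.com)"
)

matches = OTHER_SCHEMES.findall(text)
print("\n".join(matches))
```

The https:// link in the sample is left out because its scheme is not in the list.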

Order preserved, duplicates kept

URLs appear in the order they are found. Duplicates are not removed; for a unique list pipe the output into remove duplicate lines.
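
For readers scripting the same step, an order-preserving de-duplication can be sketched as follows. The function name unique_in_order is hypothetical, and this is not necessarily how remove duplicate lines works internally.

```python
def unique_in_order(lines: list[str]) -> list[str]:
    """Drop repeated lines, keeping the first occurrence of each."""
    # dict keys keep insertion order (Python 3.7+), so source order survives.
    return list(dict.fromkeys(lines))

urls = [
    "https://example.com",
    "https://example.org/page?ref=docs",
    "https://example.com",
]
print("\n".join(unique_in_order(urls)))
```

The comparison is exact, so https://example.com and https://example.com/ count as two different lines.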

Worked example

The URL inside href="..." stops at the closing quote. The trailing full stop on prose URLs is included because it is not whitespace; trim it with find and replace if needed.

Input
Visit https://example.com and https://example.org/page?ref=docs.
Mirror at https://example.net:8443/v2#install.
Docs: <a href="https://example.com/help">help</a>.
Output
https://example.com
https://example.org/page?ref=docs.
https://example.net:8443/v2#install.
https://example.com/help

Settings reference

Behaviour                  Effect on output
Scheme                     http:// or https:// required.
Other schemes              ftp://, mailto:, tel:, //host/... are skipped.
Path, query, hash, port    All preserved as part of the match.
Stop characters            Whitespace, <, >, single quote, double quote.
Trailing punctuation       A full stop or closing bracket directly after a URL is included in the match.
Order and duplicates       Matches appear in source order. Duplicates are kept.

FAQ

Will example.com without http match?
No. The pattern requires http:// or https://. This avoids false positives on filenames (config.prod.json), version numbers and casual domain mentions. To match schemeless hosts, use extract regex matches with a pattern like \b(?:[\w-]+\.)+[a-z]{2,}(?:/\S*)?, and expect it to pick up dotted filenames such as config.prod.json as well.
How do I get just the host name from each URL?
Extract first to get the full URLs, then run the result through regex replace with pattern https?:\/\/([^\/]+).* and replacement $1. This keeps everything between the scheme and the first slash: the host, plus the port if the URL has one.
Why is a full stop included on the end of some URLs?
The match stops at whitespace, angle-brackets and quotes, but not at full stops or closing brackets, because both can legitimately appear inside a URL. If a sentence ends with a URL followed by ., the dot is captured. Run find and replace on the output to trim trailing punctuation.
Are duplicates removed?
No. Every URL is listed in source order, duplicates included. Pipe the result through remove duplicate lines for a unique list.
Is anything sent to a server?
No. The match runs entirely in your browser. Nothing uploads, nothing is logged.
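
The host-name recipe from the FAQ can also be sketched in Python, for anyone who processes the extracted list in a script. The sample URLs are illustrative. Python writes the replacement as \1 where the recipe writes $1, and the standard-library urlsplit is shown as an alternative that drops the port.

```python
import re
from urllib.parse import urlsplit

urls = [
    "https://example.com/help",
    "https://example.net:8443/v2#install",
]

# Regex recipe from the FAQ: keep everything between the scheme and the first "/".
authorities = [re.sub(r"https?://([^/]+).*", r"\1", url) for url in urls]

# Alternative: urlsplit().hostname returns the host without the port.
hosts = [urlsplit(url).hostname for url in urls]

print(authorities)
print(hosts)
```

The regex version keeps :8443 on the second URL, while hostname returns only example.net.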