Regex Tester

What Are Regular Expressions?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Originally formalized in theoretical computer science by mathematician Stephen Kleene in the 1950s, regex has become an indispensable tool in every developer's toolkit for searching, matching, and manipulating text.

At their core, regular expressions describe sets of strings. The pattern \d{3}-\d{4} matches three digits, a hyphen, and four digits, such as the phone number fragment 555-1234, wherever that sequence appears in the input. This declarative approach lets you express complex text-matching rules in a single, compact line instead of writing dozens of lines of procedural string-parsing code.
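As a minimal sketch of that pattern in JavaScript (the sample sentences and variable names are made up for illustration):

```javascript
// Pattern from the text: three digits, a hyphen, four digits.
const fragment = /\d{3}-\d{4}/;

// test() reports whether the pattern occurs anywhere in the input.
const hasFragment = fragment.test("Call 555-1234 today");
const tooFewDigits = fragment.test("Call 55-1234 today");

// match() returns the matched text at index 0, or null when nothing matches.
const found = "Call 555-1234 today".match(fragment);

console.log(hasFragment, tooFewDigits, found ? found[0] : null);
// prints: true false 555-1234
```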

Regex engines are built into virtually every programming language (JavaScript, Python, PHP, Java, Go, Rust, and more), as well as command-line tools like grep, sed, and awk. Learning regex is a one-time investment that pays off across your entire career.

Regex Syntax Overview

Character classes match a single character from a defined set. [a-z] matches any lowercase letter, [0-9] any digit. Shorthand classes simplify common patterns: \d for digits, \w for word characters (letters, digits, underscore), \s for whitespace. Negated classes like \D match anything except a digit.
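The classes above can be tried against a short string. This is an illustrative sketch; the sample text and variable names are assumptions, not part of the tester:

```javascript
const sample = "Order A7_b: 42 items";

// [a-z] matches one lowercase letter at a time.
const lowercase = sample.match(/[a-z]/g);

// \w+ matches runs of letters, digits, and underscores.
const words = sample.match(/\w+/g);

// \d matches a single digit; \D (negated) matches everything else.
const digits = sample.match(/\d/g);
const digitsOnly = sample.replace(/\D/g, "");

// \s matches whitespace; here, the three spaces.
const spaceCount = sample.match(/\s/g).length;

console.log(words, digits, digitsOnly, spaceCount);
```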

Quantifiers control how many times a preceding element must appear. * means zero or more, + one or more, ? zero or one. Curly braces provide precise control: {3} exactly three, {2,5} between two and five. By default, quantifiers are greedy—they match as much as possible. Appending ? makes them lazy (e.g., .*? matches as little as possible).
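The difference between greedy and lazy matching is easiest to see side by side. A small sketch, with a made-up input string:

```javascript
const quoted = 'say "hi" and "bye"';

// Greedy: .* runs to the last quote it can still match against.
const greedy = quoted.match(/".*"/)[0];

// Lazy: .*? stops at the first closing quote.
const lazy = quoted.match(/".*?"/)[0];

// {2,5} takes up to five repetitions, even when more are available.
const bounded = "aaaaaaa".match(/a{2,5}/)[0];

console.log(greedy); // "hi" and "bye"
console.log(lazy);   // "hi"
console.log(bounded); // aaaaa
```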

Groups and capturing. Parentheses () create capturing groups that isolate sub-matches for extraction or back-referencing. Named groups (?<name>...) improve readability. Non-capturing groups (?:...) group without capturing, which is useful when you only need grouping for alternation or quantification.
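A short sketch of the three group types and a back-reference in JavaScript. The group names (year, month, day) and inputs are chosen for illustration only:

```javascript
// Named groups are read from match.groups.
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const dateMatch = "Released 2024-03-15".match(datePattern);
const { year, month, day } = dateMatch.groups;

// Numbered groups are read by position; index 0 is the whole match.
const numbered = "2024-03-15".match(/(\d{4})-(\d{2})/);

// A non-capturing group only groups: the result holds the full match and nothing else.
const nonCapturing = "grey".match(/gr(?:a|e)y/);

// \1 refers back to whatever group 1 captured, here a repeated word.
const repeated = "the the cat".match(/\b(\w+) \1\b/)[0];

console.log(year, month, day, numbered[1], nonCapturing.length, repeated);
```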

Anchors assert positions rather than matching characters. ^ matches the start of the input and $ the end; with the m (multiline) flag they also match at the start and end of each line. \b marks a word boundary, preventing partial matches (e.g., \bcat\b matches "cat" but not the "cat" inside "concatenate").
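A brief sketch of how anchors behave in JavaScript, including the effect of the m flag on ^ (the input strings are illustrative):

```javascript
const twoLines = "first line\nsecond line";

// Without the m flag, ^ only matches at the very start of the input.
const startOfInput = /^second/.test(twoLines);

// With the m flag, ^ also matches right after each line break.
const startOfLine = /^second/m.test(twoLines);

// \b requires a word boundary on both sides of "cat".
const wholeWord = /\bcat\b/.test("the cat sat");
const insideWord = /\bcat\b/.test("concatenate");
const unanchored = /cat/.test("concatenate");

console.log(startOfInput, startOfLine, wholeWord, insideWord, unanchored);
// prints: false true true false true
```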

Lookaheads and lookbehinds are zero-width assertions that check for a pattern without consuming characters. (?=...) is a positive lookahead ("followed by"), (?!...) a negative lookahead ("not followed by"). Lookbehinds (?<=...) and (?<!...) check what precedes the current position. These are powerful for complex validations like password rules.
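As an illustration of the password use case, here is a sketch with a hypothetical rule (at least 8 characters, one digit, one uppercase letter), followed by lookbehind and negative-lookahead examples. Lookbehinds need an engine that implements ES2018.

```javascript
// Each lookahead checks one requirement from the start without consuming input;
// .{8,} then enforces the minimum length. The rule itself is an assumption.
const passwordRule = /^(?=.*\d)(?=.*[A-Z]).{8,}$/;

const strong = passwordRule.test("Secret123");
const noUppercase = passwordRule.test("secret123");
const tooShort = passwordRule.test("Secret1");

// Positive lookbehind: digits that directly follow a dollar sign.
const prices = "Total: $45 and 30 items".match(/(?<=\$)\d+/g);

// Negative lookbehind: whole numbers that do not follow a dollar sign.
const counts = "Total: $45 and 30 items".match(/(?<!\$)\b\d+/g);

// Negative lookahead: "foo" only when it is not followed by "bar".
const replaced = "foobar foobaz".replace(/foo(?!bar)/g, "X");

console.log(strong, noUppercase, tooShort, prices, counts, replaced);
```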

Common Regex Patterns

Here are battle-tested patterns you can paste directly into the tester above:

Email (simplified): [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} — Covers the vast majority of real-world email addresses. For full RFC 5322 compliance you'd need a much longer pattern, but this is practical for most validation scenarios.

URL: https?://[^\s/$.?#].[^\s]* — Matches HTTP and HTTPS URLs. Useful for extracting links from plain text or log files.

IPv4 address: \b(?:\d{1,3}\.){3}\d{1,3}\b — Matches dotted-quad notation. For strict validation (0–255 per octet), you'd need range checks, but this works well for extraction.

Phone number (US): \(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4} — Handles formats like (555) 123-4567, 555.123.4567, and 5551234567.

ISO date: \d{4}-\d{2}-\d{2} — Matches dates in YYYY-MM-DD format, the international standard used in APIs, databases, and file names.
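The patterns above can be used for extraction from a log line, roughly as follows. The log line is invented, and isStrictIPv4 is a hypothetical helper that sketches the per-octet range check mentioned for IPv4. Note that / must be escaped inside a JavaScript regex literal.

```javascript
const logLine =
  "2024-03-15 10.0.0.7 GET https://example.com/docs?id=7 from admin@example.com";

const isoDate = /\d{4}-\d{2}-\d{2}/g;
const ipv4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;
const url = /https?:\/\/[^\s/$.?#].[^\s]*/g;
const email = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

const dates = logLine.match(isoDate);
const addresses = logLine.match(ipv4);
const urls = logLine.match(url);
const emails = logLine.match(email);

// Hypothetical strict check: the loose shape first, then each octet must be 0-255.
function isStrictIPv4(candidate) {
  if (!/^(?:\d{1,3}\.){3}\d{1,3}$/.test(candidate)) return false;
  return candidate.split(".").every((octet) => Number(octet) <= 255);
}

console.log(dates, addresses, urls, emails);
console.log(isStrictIPv4("192.168.1.1"), isStrictIPv4("999.1.1.1"));
```

Keeping the regex loose and doing the numeric check in code is one design choice among several; a single pattern with alternations per octet also works but is harder to read.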

Regex Flavors: PCRE vs JavaScript

Not all regex engines are identical. The two most common flavors developers encounter are PCRE (Perl Compatible Regular Expressions, used in PHP, Nginx, and many server-side tools) and JavaScript's built-in RegExp.

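Lookbehinds: JavaScript had no lookbehind support at all before ES2018, so older browsers may reject (?<=...) and (?<!...) as syntax errors. Engines that implement ES2018 accept variable-length lookbehinds such as (?<=\$\d+\.). PCRE has traditionally been the stricter of the two: each alternative inside a lookbehind must match a fixed length.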

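Named groups: JavaScript accepts only (?<name>...). PCRE accepts that form as well as the Python-style (?P<name>...), so a pattern written with (?P<name>...) fails in JavaScript until it is rewritten. The groups capture the same way in both; only the syntax differs.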

Unicode support: JavaScript requires the u flag for full Unicode matching (e.g., /\p{L}/u for any letter). PCRE enables Unicode with a compile-time option (the u modifier in PHP) or a leading verb in the pattern: (*UTF8) in the original PCRE library, (*UTF) in PCRE2.
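A small sketch of what the u flag changes in JavaScript. The sample strings use escape sequences so the example does not depend on file encoding:

```javascript
// "naïve café 123" written with escapes for ï (U+00EF) and é (U+00E9).
const text = "na\u00EFve caf\u00E9 123";

// With the u flag, \p{L} matches any Unicode letter.
const letterRuns = text.match(/\p{L}+/gu);

// Without the u flag, \p{L} is not a property escape and does not match a letter.
const withoutFlag = /\p{L}/.test("\u00E9");

// The u flag also makes . treat a character outside the BMP as one unit.
const emoji = "\u{1F600}";
const dotWithoutU = /^.$/.test(emoji);
const dotWithU = /^.$/u.test(emoji);

console.log(letterRuns, withoutFlag, dotWithoutU, dotWithU);
```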

Atomic groups and possessive quantifiers: PCRE supports these performance features, written (?>...) and a++ (also a*+ and a?+), which prevent the engine from backtracking into what they matched. JavaScript has no direct equivalent, though you can sometimes restructure the pattern to achieve the same effect.
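One commonly used restructuring, shown here as a sketch rather than a drop-in replacement, is to capture inside a lookahead and then consume the capture with a back-reference. Because the engine does not backtrack into a completed lookahead, the captured run behaves roughly like an atomic group. The patterns and variable names are illustrative.

```javascript
// Ordinary greedy quantifier: a+ gives back one "a" so that "ab" can still match.
const backtracking = /a+ab/;

// Emulated atomic group: the lookahead captures the full run of a's and \1
// consumes exactly that capture, so nothing can be handed back afterwards.
const atomicLike = /(?=(a+))\1ab/;

// Same emulation followed by "b", which does not need any a's handed back.
const atomicThenB = /(?=(a+))\1b/;

const plainResult = backtracking.test("aaab");
const atomicResult = atomicLike.test("aaab");
const atomicThenBResult = atomicThenB.test("aaab");

console.log(plainResult, atomicResult, atomicThenBResult);
// prints: true false true
```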

This tester uses JavaScript's regex engine, which is what runs in browsers and Node.js. If you're writing regex for a PHP or server-side context, keep the flavor differences in mind.

Performance Tips

Avoid catastrophic backtracking. Patterns like (a+)+$ can cause exponential execution time on non-matching input because the engine explores an astronomical number of paths. When a pattern is slow, look for nested quantifiers or overlapping alternations.
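A sketch of removing the nested quantifier from the example above. The two patterns accept the same inputs, but only the rewritten one is run on a long non-matching string here; the risky pattern is deliberately kept to short inputs.

```javascript
const risky = /(a+)+$/; // nested quantifier: many ways to split a run of a's
const safe = /a+$/;     // same set of matches, one way to split

// On short inputs both patterns give the same answer.
const shortInputs = ["a", "aaaa", "aaab", "ba", "", "b"];
const sameAnswers = shortInputs.every((s) => risky.test(s) === safe.test(s));

// A long run of a's followed by a non-matching character is the problem case.
// Only the safe pattern is tried here.
const longInput = "a".repeat(2000) + "!";
const longResult = safe.test(longInput);

console.log(sameAnswers, longResult);
// prints: true false
```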

Be specific. Replace .* with more constrained character classes like [^"]* when you know the boundary character. Specific patterns fail faster on non-matching input, which is where regex spends most of its time.
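The [^"]* suggestion can be compared with .* on a line that contains two quoted values (the input is made up for illustration):

```javascript
const attributes = 'name="alpha" role="admin"';

// .* is allowed to cross quote characters, so it runs to the last quote.
const loose = attributes.match(/"(.*)"/)[1];

// [^"]* cannot cross a quote, so it stops at the first closing quote.
const specific = attributes.match(/"([^"]*)"/)[1];

// With the g flag, matchAll() yields every quoted value in turn.
const allValues = [...attributes.matchAll(/"([^"]*)"/g)].map((m) => m[1]);

console.log(loose);     // alpha" role="admin
console.log(specific);  // alpha
console.log(allValues); // [ 'alpha', 'admin' ]
```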

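Anchor when possible. A pattern that starts with ^ can only match at the start of the input (or of a line, with the m flag), so every other starting position is rejected immediately instead of running the rest of the pattern there. Adding $ likewise rejects input with unexpected trailing characters.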

Use non-capturing groups. If you don't need the captured text, use (?:...) instead of (...). Capturing adds memory overhead for storing match groups.

Pre-compile where possible. In languages like JavaScript and Python, compile your regex once outside the loop rather than re-creating it on every iteration. In JavaScript, use a RegExp literal or constructor outside the function scope.
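A minimal sketch of hoisting a pattern out of a loop. ISO_DATE and countIsoDates are hypothetical names. The second half shows a related pitfall: a shared RegExp with the g flag keeps its position between test() calls.

```javascript
// Created once, when the module loads, and reused on every call.
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

function countIsoDates(values) {
  let count = 0;
  for (const value of values) {
    if (ISO_DATE.test(value)) count += 1;
  }
  return count;
}

const total = countIsoDates(["2024-03-15", "15/03/2024", "1999-12-31"]);

// A g-flagged RegExp remembers lastIndex, so the second test() starts at index 1,
// finds nothing, and returns false.
const stateful = /a/g;
const firstCall = stateful.test("a");
const secondCall = stateful.test("a");

console.log(total, firstCall, secondCall);
// prints: 2 true false
```

For a shared pattern used only with test(), leaving off the g flag avoids that state.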

Related Tools

Regex is often just one step in a larger text-processing workflow. These tools complement the regex tester:

  • Diff Tool — Compare before-and-after text to verify that your regex replacement worked correctly.
  • URL Encoder/Decoder — Decode percent-encoded strings before running regex against URL parameters or path segments.

Frequently Asked Questions

Which regex flags are supported?

This tester uses JavaScript's RegExp engine, which supports flags like g (global), i (case-insensitive), m (multiline), s (dotAll), and u (Unicode). Enter your pattern and the engine applies it with the global flag to find all matches.
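For reference, a sketch of what the g, i, m, and s flags change when the same kind of pattern is run directly in JavaScript (inputs are illustrative):

```javascript
// g finds every match; i ignores case.
const allCats = "Cat cat CAT".match(/cat/gi);
const exactCaseOnly = "Cat cat CAT".match(/cat/g);

// s (dotAll) lets . match a line break.
const dotDefault = /a.b/.test("a\nb");
const dotAll = /a.b/s.test("a\nb");

// m (multiline) lets ^ and $ match at each line, not just the whole input.
const perLine = "x\ny".match(/^\w$/gm);
const wholeInput = "x\ny".match(/^\w$/g);

console.log(allCats, exactCaseOnly, dotDefault, dotAll, perLine, wholeInput);
```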

Why does my regex work in Python but not here?

Subtle differences exist between regex flavors. Python's re module writes named groups as (?P<name>...), where JavaScript expects (?<name>...), and it supports features that JavaScript lacks, such as conditional patterns.
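For the named-group difference specifically, a pattern can often be ported mechanically. The helper below, pythonNamedGroupsToJs, is a hypothetical and deliberately naive sketch: it rewrites (?P<name>...) and the back-reference form (?P=name), and does not handle escaped parentheses, character classes, or any other Python-only syntax.

```javascript
// Naive textual rewrite of Python-style named groups to JavaScript syntax.
function pythonNamedGroupsToJs(source) {
  return source
    .replace(/\(\?P</g, "(?<")              // (?P<name>...) becomes (?<name>...)
    .replace(/\(\?P=(\w+)\)/g, "\\k<$1>");  // (?P=name) becomes \k<name>
}

const pythonStyle = "(?P<word>\\w+) (?P=word)";
const jsSource = pythonNamedGroupsToJs(pythonStyle);
const doubledWord = new RegExp(jsSource);

const match = "hello hello".match(doubledWord);
const noMatch = doubledWord.test("hello world");

console.log(jsSource);                  // (?<word>\w+) \k<word>
console.log(match.groups.word, noMatch); // hello false
```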

How can I debug a complex regex?

Break it into smaller sub-patterns and test each one individually. Use non-capturing groups to isolate sections. The real-time match highlighting in this tester makes it easy to see exactly which parts of your input are being captured.

Is my test data sent to a server?

No. All pattern matching happens locally in your browser. Nothing is transmitted over the network, so you can safely test patterns against sensitive data.

Should I use regex for HTML parsing?

Generally, no. HTML is not a regular language, and regex cannot reliably handle nested tags, attributes with varying quote styles, and self-closing elements. Use a proper DOM parser instead. Regex is fine for quick one-off extractions from known, simple HTML snippets.