Regular Expressions Find and Replace: The Complete Guide for 2026
Regular expressions are one of the most powerful tools in a developer's arsenal, and regex find and replace is where that power becomes most practical. Whether you are renaming thousands of variables across a codebase, reformatting dates in a log file, or cleaning up messy data in a CSV, regex replace turns hours of manual editing into a single command.
This guide covers everything you need to know about regex find and replace in 2026. We start with the fundamental syntax of capture groups and backreferences, move through replacement patterns in every major tool and language, and finish with 20+ practical recipes you can copy and adapt for your own work. By the end, you will be able to write confident, correct regex replacements in code editors, on the command line, and in your programming language of choice.
How Regex Find and Replace Works
At its core, regex find and replace is a two-step operation. First, a regular expression pattern matches portions of the input text. Second, a replacement string specifies what to substitute in place of each match. The replacement string can include literal text, references to captured groups from the pattern, and special replacement tokens.
Consider a simple example. You have a list of dates in MM/DD/YYYY format and need to convert them to YYYY-MM-DD:
# Pattern (find):
(\d{2})/(\d{2})/(\d{4})
# Replacement (replace with):
$3-$1-$2
# Input: 02/11/2026
# Output: 2026-02-11
The parentheses in the pattern create capture groups. The first group (\d{2}) captures the month, the second captures the day, and the third (\d{4}) captures the year. In the replacement string, $1, $2, and $3 refer back to those captured values, allowing you to rearrange them into a new format.
This is the fundamental mechanism that makes regex replace so powerful: you can match a pattern, capture pieces of that pattern, and reassemble those pieces in any order with any additional text.
Capture Groups and Backreferences
Capture groups are the building blocks of regex replacement. Every pair of parentheses in your pattern creates a numbered group, starting from 1, that you can reference in the replacement string.
Numbered Capture Groups
Groups are numbered left to right by their opening parenthesis:
# Pattern:
((\w+)\s(\w+))@(\w+\.\w+)
# Groups:
# $1 = full name (john doe) — outer group
# $2 = first name (john) — first inner group
# $3 = last name (doe) — second inner group
# $4 = domain (example.com) — last group
# Input: john doe@example.com
# Replace: $2.$3@$4
# Output: john.doe@example.com
When groups are nested, the outer group always gets the lower number because numbering follows the position of the opening parenthesis.
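To check the numbering rule on a concrete case, here is a minimal Python sketch that runs the pattern above through the standard re module. The variable names are illustrative only, and Python writes backreferences as \2 where the listing above uses $2.

```python
import re

# Same pattern as above: one outer group wrapping two inner groups, then the domain.
pattern = re.compile(r'((\w+)\s(\w+))@(\w+\.\w+)')

match = pattern.search('john doe@example.com')
# groups() lists the captures in order of their opening parenthesis.
groups = match.groups()

# Rebuild the address from the inner groups and the domain group.
result = pattern.sub(r'\2.\3@\4', 'john doe@example.com')

print(groups)
print(result)
```

The outer group comes first in the tuple because its opening parenthesis appears first in the pattern.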
Named Capture Groups
For complex patterns, numbered groups become hard to track. Named groups solve this by letting you assign meaningful names:
# Pattern (JavaScript/Python/PCRE):
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
# Replacement (varies by tool):
# JavaScript: $<month>/$<day>/$<year>
# Python: \g<month>/\g<day>/\g<year>
# VS Code: $<month>/$<day>/$<year> (since 2020)
# sed: does not support named groups
Named groups make your regex self-documenting. When you revisit a replacement six months later, $<year> is immediately clear, while $3 requires re-reading the entire pattern to understand.
Non-Capturing Groups
Sometimes you need parentheses for grouping (alternation, quantifiers) but do not want the group to receive a number. Non-capturing groups use (?:...) syntax:
# Without non-capturing group:
(https?://)(\w+\.example\.com)(/\S+)
# $1 = protocol, $2 = domain, $3 = path
# With non-capturing group:
(?:https?://)(\w+\.example\.com)(/\S+)
# $1 = domain, $2 = path (protocol is matched but not captured)
Non-capturing groups keep your group numbers clean when you only need some parts of the match in the replacement.
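The effect on group numbering can be verified directly. The sketch below, in Python, compares the two patterns above; the sample URL and the variable names are made up for illustration.

```python
import re

with_capture = re.compile(r'(https?://)(\w+\.example\.com)(/\S+)')
without_capture = re.compile(r'(?:https?://)(\w+\.example\.com)(/\S+)')

# The .groups attribute reports how many capturing groups a pattern defines.
capture_count = with_capture.groups
non_capture_count = without_capture.groups

# With the protocol uncaptured, \1 is the domain and \2 is the path.
rewritten = without_capture.sub(r'\1 -> \2', 'https://api.example.com/v1/users')

print(capture_count, non_capture_count, rewritten)
```

The protocol is still part of the match, so it is removed by the substitution even though no group number refers to it.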
Special Replacement Patterns
Beyond capture group references, most regex engines support special tokens in the replacement string. These vary slightly between tools, but the core set is widely supported.
The Full Match: $& (or $0)
$& (or $0 in some engines) inserts the entire matched text. This is useful when you want to wrap or surround a match without capturing anything specific:
# Wrap all numbers in square brackets
# Pattern: \d+
# Replace: [$&]
# Input: There are 42 items in 3 boxes
# Output: There are [42] items in [3] boxes
# Add quotes around words starting with uppercase
# Pattern: \b[A-Z]\w+
# Replace: "$&"
# Input: Welcome to Paris and London
# Output: "Welcome" to "Paris" and "London"
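Python's re module does not use $& or $0; the whole match is referenced as \g<0> in the replacement string. A minimal sketch of the first example above, with an illustrative variable name:

```python
import re

# \g<0> is Python's spelling of "the entire match".
wrapped = re.sub(r'\d+', r'[\g<0>]', 'There are 42 items in 3 boxes')

print(wrapped)
```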
Pre-Match and Post-Match: $` and $'
These are less commonly used but available in JavaScript and Perl:
# $` = text BEFORE the match
# $' = text AFTER the match
# These are primarily useful in JavaScript:
"hello world".replace(/world/, "[$`]")
// Result: "hello [hello ]"
// $` contains everything before "world"
In practice, $` and $' are rarely needed. They exist for completeness but most real-world replacements use capture groups instead.
Literal Dollar Sign: $$
Since $ has special meaning in replacement strings, you need to escape it to insert a literal dollar sign:
# Pattern: (\d+)
# Replace: $$$1 (that is: $$ followed by $1)
# Input: Price: 42
# Output: Price: $42
# In JavaScript:
"42 euros".replace(/(\d+)/, "$$$1")
// Result: "$42 euros"
Case Conversion in Replacements
Some engines support case-changing tokens in the replacement string. This is supported in VS Code, Vim, Perl, and some others:
# VS Code / Vim / Perl replacement modifiers:
# \u — Uppercase next character
# \l — Lowercase next character
# \U — Uppercase everything until \E
# \L — Lowercase everything until \E
# \E — End case conversion
# Convert to UPPER CASE in VS Code:
# Find: (\w+)
# Replace: \U$1
# Title Case first letter:
# Find: (\w)(\w+)
# Replace: \u$1\L$2\E
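Engines without these tokens need a different route. Python's re module, for example, has no \U or \L in replacement strings, so the usual workaround is a replacement function. The following is a sketch of the title-case example under that assumption; title_case_words is a hypothetical helper name.

```python
import re

def title_case_words(text):
    # Group 1 is the first character of each word, group 2 the (possibly empty) rest.
    return re.sub(
        r'(\w)(\w*)',
        lambda m: m.group(1).upper() + m.group(2).lower(),
        text,
    )

print(title_case_words('hELLO wORLD'))
```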
Regex Replace in Code Editors
Code editors are where most developers first encounter regex find and replace. Being able to rewrite code across an entire file or project in a single operation saves a great deal of manual editing.
VS Code
VS Code uses JavaScript-style regex with some extensions. Open Find and Replace with Ctrl+H (or Cmd+Option+F on macOS, where Cmd+H hides the application), then click the .* icon to enable regex mode.
# VS Code replacement syntax:
# $1, $2, ... — numbered capture groups
# $0 — entire match
# $<name> — named capture groups
# \n — newline in replacement
# \t — tab in replacement
# \u, \l, \U, \L, \E — case modifiers
# Example: Convert console.log to logger.info
# Find: console\.log\(([^)]+)\)
# Replace: logger.info($1)
# Example: Add TypeScript types to function parameters
# Find: function (\w+)\((\w+), (\w+)\)
# Replace: function $1($2: string, $3: number)
# Example: Convert double quotes to single quotes
# Find: "([^"]*)"
# Replace: '$1'
VS Code also supports multi-file regex replace. Open the search sidebar (Ctrl+Shift+H), enable regex, type your pattern, and replace across your entire workspace. You can include/exclude files by glob pattern, making it easy to target only TypeScript files or skip node_modules.
Sublime Text
Sublime Text uses the Boost regex engine, which supports Perl-compatible syntax. Open Find and Replace with Ctrl+H, then enable regex with Alt+R.
# Sublime Text replacement syntax:
# $1, $2, ... — numbered groups (or \1, \2)
# ${1}, ${2} — groups in braces (avoids ambiguity)
# \U, \L, \u, \l, \E — case modifiers
# Example: Convert function declarations to arrow functions
# Find: function\s+(\w+)\s*\(([^)]*)\)\s*\{
# Replace: const $1 = ($2) => {
# When a group reference is followed by a literal digit:
# Find: (\w+)(\d)
# Replace: ${1}0${2} (braces stop $10 from being read as group 10)
JetBrains IDEs (IntelliJ, WebStorm, PyCharm)
JetBrains IDEs use Java regex syntax. Open Find and Replace with Ctrl+R, enable regex with Alt+X.
# JetBrains replacement syntax:
# $1, $2, ... — numbered capture groups
# ${name} — named capture groups
# \l, \u — case conversion (next char)
# \L, \U, \E — case conversion (range)
# JetBrains structural search is even more powerful:
# Find (structural): $Instance$.assertEquals($Expected$, $Actual$)
# Replace (structural): assertThat($Actual$).isEqualTo($Expected$)
JetBrains IDEs also offer "Structural Search and Replace" which goes beyond regex to understand the syntax tree of your code. This can match patterns like "any method call on a variable of type X" which is impossible with pure regex.
Regex Replace on the Command Line
The command line is where regex replace reaches its full potential. You can process files of any size, pipe commands together, and script replacements for automation.
sed (Stream Editor)
sed is the classic Unix tool for stream editing. Its substitute command s/pattern/replacement/flags is one of the most commonly used commands in shell scripting.
# Basic syntax:
sed 's/pattern/replacement/flags' input.txt
# Replace first occurrence on each line:
sed 's/foo/bar/' file.txt
# Replace ALL occurrences on each line (g flag):
sed 's/foo/bar/g' file.txt
# Case-insensitive replace (GNU sed):
sed 's/foo/bar/gi' file.txt
# Edit file in place (-i flag):
sed -i 's/foo/bar/g' file.txt
# macOS sed requires empty string with -i:
sed -i '' 's/foo/bar/g' file.txt
# Using capture groups (note: \( \) for basic regex, ( ) with -E):
sed -E 's/([0-9]{2})\/([0-9]{2})\/([0-9]{4})/\3-\1-\2/g' dates.txt
# Replace only on lines that do NOT match a pattern (here: skip comment lines):
sed '/^#/!s/localhost/0.0.0.0/g' config.txt
# Multiple replacements in one pass:
sed -E 's/width: ([0-9]+)px/width: \1rem/g; s/height: ([0-9]+)px/height: \1rem/g' styles.css
# Delete lines matching a pattern (not replace, but commonly needed):
sed '/^$/d' file.txt # Delete empty lines
sed '/^#/d' file.txt # Delete comment lines
sed '/DEBUG/d' log.txt # Delete debug lines
Important sed note: By default, sed uses POSIX Basic Regular Expressions (BRE), where ( ) { } need backslash escaping to act as operators, and + and ? are available only as the GNU extensions \+ and \?. Use sed -E (or sed -r on older GNU systems) for Extended Regular Expressions (ERE), which match the syntax most developers expect.
Perl One-Liners
When sed is not powerful enough, Perl one-liners give you Perl's native regex engine, the one that PCRE (Perl-Compatible Regular Expressions) is modeled on:
# Perl one-liner syntax:
perl -pe 's/pattern/replacement/flags' file.txt
# Edit in place:
perl -i -pe 's/pattern/replacement/g' file.txt
# Perl supports lookahead, lookbehind, and \K:
perl -pe 's/(?<=price: )\d+/99/g' file.txt
# Use code in replacement with /e flag:
perl -pe 's/(\d+)/sprintf("%05d", $1)/ge' file.txt
# Named captures:
perl -pe 's/(?<first>\w+) (?<last>\w+)/$+{last}, $+{first}/g' names.txt
# Multi-line matching:
perl -0777 -pe 's/start.*?end/REDACTED/gs' file.txt
awk
awk is less commonly used for pure find-and-replace but excels when you need to combine regex matching with field-based processing:
# Replace using gsub (global substitution):
awk '{ gsub(/pattern/, "replacement"); print }' file.txt
# Replace only in specific fields:
awk -F',' '{ gsub(/old/, "new", $3); print }' OFS=',' data.csv
# Conditional replacement:
awk '/^ERROR/ { gsub(/timestamp=[0-9]+/, "timestamp=REDACTED") } { print }' log.txt
# Using match groups (GNU awk with match()):
gawk '{ match($0, /name="([^"]+)"/, arr); if (arr[1]) print arr[1] }' file.xml
grep with Context (Finding Before Replacing)
Before running a replacement, it is good practice to preview what will be matched. grep (or rg, ripgrep) helps you verify your pattern:
# Preview matches before replacing:
grep -Pn 'pattern' file.txt
# Show context around matches:
grep -Pn -C 2 'pattern' file.txt
# Count matches:
grep -Pc 'pattern' file.txt
# Find files containing pattern (then replace):
grep -Prl 'pattern' src/ | xargs sed -i 's/pattern/replacement/g'
# ripgrep (faster alternative):
rg 'pattern' --files-with-matches | xargs sed -i 's/pattern/replacement/g'
Regex Replace in Programming Languages
JavaScript: String.replace() and String.replaceAll()
JavaScript provides two string methods for regex replacement. Understanding their differences is crucial.
// String.replace() — replaces FIRST match (unless /g flag used)
const text = "foo bar foo baz foo";
text.replace(/foo/, "qux"); // "qux bar foo baz foo"
text.replace(/foo/g, "qux"); // "qux bar qux baz qux"
// String.replaceAll() — replaces ALL matches (requires /g with regex)
text.replaceAll("foo", "qux"); // "qux bar qux baz qux" (string arg)
text.replaceAll(/foo/g, "qux"); // "qux bar qux baz qux" (regex must have /g)
// Capture groups in replacement:
"John Smith".replace(/(\w+) (\w+)/, "$2, $1");
// "Smith, John"
// Named capture groups:
"2026-02-11".replace(
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
"$<month>/$<day>/$<year>"
);
// "02/11/2026"
// Function as replacement (the real power):
"hello world".replace(/\b\w/g, (match) => match.toUpperCase());
// "Hello World"
// Function with capture groups:
"42px 16px 8px".replace(
/(\d+)px/g,
(match, value) => `${parseInt(value) / 16}rem`
);
// "2.625rem 1rem 0.5rem"
// Full function signature:
"abc".replace(/(a)(b)(c)/, (match, p1, p2, p3, offset, string) => {
// match = "abc" (full match)
// p1 = "a" (group 1)
// p2 = "b" (group 2)
// p3 = "c" (group 3)
// offset = 0 (position in string)
// string = "abc" (original string)
return `${p3}${p2}${p1}`;
});
// "cba"
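The function replacement form is where JavaScript's replace becomes most flexible. You can run arbitrary logic for each match, such as lookups or complex transformations that are impossible with static replacement strings. The callback must return its value synchronously: an async callback returns a Promise, which is coerced to the string "[object Promise]", so any asynchronous work has to be done before or after the replace call.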
// Advanced: process each match with full logic
const template = "Hello {{name}}, you have {{count}} messages.";
const data = { name: "Alice", count: 5 };
template.replace(/\{\{(\w+)\}\}/g, (match, key) => {
return data[key] !== undefined ? data[key] : match;
});
// "Hello Alice, you have 5 messages."
// Sanitize HTML entities:
function escapeHTML(str) {
return str.replace(/[&<>"']/g, (ch) => ({
'&': '&amp;',
'<': '&lt;',
'>': '&gt;',
'"': '&quot;',
"'": '&#39;'
}[ch]));
}
// Slugify a title:
function slugify(str) {
return str
.toLowerCase()
.replace(/[^\w\s-]/g, '') // remove non-word chars
.replace(/\s+/g, '-') // spaces to hyphens
.replace(/-+/g, '-') // collapse multiple hyphens
.replace(/^-|-$/g, ''); // trim leading/trailing hyphens
}
Python: re.sub()
Python's re.sub() is the primary tool for regex replacement. It supports both string replacements and function replacements.
import re
# Basic replacement:
re.sub(r'pattern', 'replacement', text)
re.sub(r'pattern', 'replacement', text, count=1) # replace only first match
# Capture group references (use \1 or \g<1>):
re.sub(r'(\w+) (\w+)', r'\2, \1', 'John Smith')
# 'Smith, John'
# Named groups (use \g<name>):
re.sub(
r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})',
r'\g<month>/\g<day>/\g<year>',
'2026-02-11'
)
# '02/11/2026'
# IMPORTANT: Use raw strings (r'...') for both pattern AND replacement
# Without raw string, \1 is interpreted as an escape character
re.sub(r'(\d+)', r'[\1]', 'item 42') # Correct: 'item [42]'
re.sub('(\\d+)', '[\\1]', 'item 42') # Also works but harder to read
# Function replacement:
def title_case(match):
return match.group(0).capitalize()
re.sub(r'\b\w+', title_case, 'hello world')
# 'Hello World'
# Lambda replacement:
re.sub(r'\d+', lambda m: str(int(m.group()) * 2), 'price: 50, tax: 8')
# 'price: 100, tax: 16'
# Using match object in function:
def format_phone(match):
return f"({match.group(1)}) {match.group(2)}-{match.group(3)}"
re.sub(r'(\d{3})(\d{3})(\d{4})', format_phone, '5551234567')
# '(555) 123-4567'
# Compiled patterns (better performance for repeated use):
pattern = re.compile(r'(\d{4})-(\d{2})-(\d{2})')
pattern.sub(r'\2/\3/\1', '2026-02-11 and 2026-03-15')
# '02/11/2026 and 03/15/2026'
# re.subn() returns count of replacements:
result, count = re.subn(r'foo', 'bar', 'foo foo foo')
# result = 'bar bar bar', count = 3
# Flags:
re.sub(r'hello', 'hi', 'Hello World', flags=re.IGNORECASE)
# 'hi World'
re.sub(r'^#.*$', '', multiline_text, flags=re.MULTILINE)
# Blanks out comment lines (the line breaks themselves remain)
Practical Recipes: 20+ Real-World Regex Replacements
Here are battle-tested regex replacement recipes for tasks developers encounter regularly. Each recipe includes the pattern, replacement, and examples in multiple tools.
1. Convert Date Formats (MM/DD/YYYY to YYYY-MM-DD)
# Pattern: (\d{2})/(\d{2})/(\d{4})
# Replace: $3-$1-$2
# sed:
sed -E 's|([0-9]{2})/([0-9]{2})/([0-9]{4})|\3-\1-\2|g' dates.txt
# JavaScript:
text.replace(/(\d{2})\/(\d{2})\/(\d{4})/g, '$3-$1-$2');
# Python:
re.sub(r'(\d{2})/(\d{2})/(\d{4})', r'\3-\1-\2', text)
# VS Code: Find (\d{2})/(\d{2})/(\d{4}) Replace $3-$1-$2
2. Rename Variables: camelCase to snake_case
# A single global replacement handles any number of humps:
# JavaScript (handles multi-hump camels):
text.replace(/([a-z])([A-Z])/g, '$1_$2').toLowerCase();
// "getUserName" -> "get_user_name"
# Python:
re.sub(r'([a-z])([A-Z])', r'\1_\2', text).lower()
# sed (GNU sed only; \L is a GNU extension):
sed -E 's/([a-z])([A-Z])/\1_\L\2/g' file.txt
# VS Code: Find ([a-z])([A-Z]) Replace $1_\l$2
# (Replace All converts every hump in one pass; runs of capitals such as "HTTPServer" need extra handling)
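The single-pattern versions above do not split runs of capitals such as "HTTP". One common workaround is a two-pass substitution. The sketch below is one way to do it in Python, not the only one; to_snake_case is a hypothetical helper name.

```python
import re

def to_snake_case(name):
    # Pass 1: split before a capital that starts a capitalized word ("HTTPResponse" -> "HTTP_Response").
    partial = re.sub(r'(.)([A-Z][a-z]+)', r'\1_\2', name)
    # Pass 2: split between a lowercase letter or digit and a capital ("parseHTTP" -> "parse_HTTP").
    return re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', partial).lower()

print(to_snake_case('getUserName'))
print(to_snake_case('parseHTTPResponse'))
```

Acronyms are kept together as one word, which is usually what a rename across a codebase wants, but it is a convention rather than a rule.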
3. Rename Variables: snake_case to camelCase
# JavaScript:
text.replace(/_([a-z])/g, (match, letter) => letter.toUpperCase());
// "get_user_name" -> "getUserName"
# Python:
re.sub(r'_([a-z])', lambda m: m.group(1).upper(), text)
# VS Code: Find _([a-z]) Replace \u$1
4. Extract and Replace URLs
# Convert plain URLs to Markdown links:
# Pattern: (https?://[^\s<>"]+)
# Replace: [$1]($1)
# JavaScript:
text.replace(/(https?:\/\/[^\s<>"]+)/g, '[$1]($1)');
# Python:
re.sub(r'(https?://[^\s<>"]+)', r'[\1](\1)', text)
# Convert Markdown links to plain URLs:
# Pattern: \[([^\]]+)\]\(([^)]+)\)
# Replace: $2
text.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '$2');
5. Extract and Replace Email Addresses
# Redact email addresses:
# Pattern: [\w.+-]+@[\w-]+\.[\w.-]+
# Replace: [REDACTED]
# Mask the username part:
# Pattern: ([\w.+-]+)(@[\w-]+\.[\w.-]+)
# Replace: ***$2
# JavaScript:
text.replace(/([\w.+-]+)(@[\w-]+\.[\w.-]+)/g, '***$2');
// "alice@example.com" -> "***@example.com"
# Python:
re.sub(r'([\w.+-]+)(@[\w-]+\.[\w.-]+)', r'***\2', text)
6. Clean HTML Tags
# Remove ALL HTML tags:
# Pattern: <[^>]+>
# Replace: (empty string)
# JavaScript:
html.replace(/<[^>]+>/g, '');
# Remove specific tags but keep content:
# Remove <b> and </b> but keep the text inside:
html.replace(/<\/?b>/g, '');
# Remove tags AND their content (e.g., <script>):
html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');
# Convert <br> tags to newlines:
html.replace(/<br\s*\/?>/gi, '\n');
# Strip attributes from tags (keep tags, remove attributes):
# Pattern: <(\w+)\s+[^>]*>
# Replace: <$1>
html.replace(/<(\w+)\s+[^>]*>/g, '<$1>');
Warning: Regex is not a reliable HTML parser for production use. These patterns work for simple cleanup tasks, but for complex HTML manipulation, use a proper DOM parser (like DOMParser in JavaScript or BeautifulSoup in Python).
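For comparison, here is a minimal sketch of tag stripping with a real parser, using only Python's standard html.parser module instead of BeautifulSoup. The class and function names are hypothetical, and the choice to drop script and style content is an assumption about what "clean" means for your data.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text content and ignores everything inside <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)

def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.parts)

print(strip_tags('<p>Hello <b>world</b></p><script>alert(1)</script>'))
```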
7. Normalize Whitespace
# Collapse multiple spaces to single space:
text.replace(/ +/g, ' ');
# Collapse all whitespace (spaces, tabs, newlines) to single space:
text.replace(/\s+/g, ' ').trim();
# Remove trailing whitespace from each line:
text.replace(/[ \t]+$/gm, '');
# Remove leading whitespace from each line:
text.replace(/^[ \t]+/gm, '');
# Normalize line endings (CRLF to LF):
text.replace(/\r\n/g, '\n');
# Remove blank lines:
text.replace(/^\s*\n/gm, '');
# sed equivalents:
sed -E 's/ +/ /g' file.txt # Collapse runs of spaces
sed 's/[[:space:]]*$//' file.txt # Trim trailing whitespace
sed '/^$/d' file.txt # Remove blank lines
8. Add Line Numbers
# JavaScript:
let lineNum = 0;
text.replace(/^/gm, () => `${++lineNum}: `);
# Python:
lines = text.split('\n')
result = '\n'.join(f'{i+1}: {line}' for i, line in enumerate(lines))
# sed:
sed = file.txt | sed 'N;s/\n/: /'
# More practical: add line numbers to non-empty lines only
text.replace(/^(.+)$/gm, (match, line, offset) => {
const num = text.substring(0, offset).split('\n').length;
return `${num}: ${line}`;
});
9. Convert CSV to JSON
# For simple CSV (no quoted fields with commas):
# First line as headers, remaining lines as objects
# JavaScript approach:
function csvToJson(csv) {
const lines = csv.trim().split('\n');
const headers = lines[0].split(',').map(h => h.trim());
return lines.slice(1).map(line => {
const values = line.split(',').map(v => v.trim());
return Object.fromEntries(headers.map((h, i) => [h, values[i]]));
});
}
# For the regex-only approach (simple 3-column CSV):
# Pattern: ^(\w+),(\w+),(\w+)$
# Replace: {"col1": "$1", "col2": "$2", "col3": "$3"}
# sed (simple 3-column):
sed -E 's/^([^,]+),([^,]+),([^,]+)$/{"name": "\1", "age": "\2", "city": "\3"}/' data.csv
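The approaches above break as soon as a field contains a quoted comma. When that can happen, a CSV parser is the safer tool. The following is a small Python sketch using the standard csv and json modules; csv_to_json is a hypothetical helper name and the sample data is invented.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    # DictReader uses the first row as keys and understands quoted fields.
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps([dict(row) for row in reader])

sample = 'name,city\nAlice,"Paris, France"\nBob,London\n'
print(csv_to_json(sample))
```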
10. Format Phone Numbers
# Convert 10 digits to (XXX) XXX-XXXX:
# Pattern: (\d{3})(\d{3})(\d{4})
# Replace: ($1) $2-$3
# Handle various input formats:
# Pattern: \+?1?[-.\s]?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})
# Replace: ($1) $2-$3
# JavaScript:
phone.replace(/\+?1?[-.\s]?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})/, '($1) $2-$3');
// "15551234567" -> "(555) 123-4567"
// "555.123.4567" -> "(555) 123-4567"
// "(555) 123 4567" -> "(555) 123-4567"
11. Convert Pixels to Rem
# Pattern: (\d+)px
# Replace: calculated rem value (needs function)
# JavaScript:
css.replace(/(\d+)px/g, (match, px) => `${parseInt(px) / 16}rem`);
# Python:
re.sub(r'(\d+)px', lambda m: f'{int(m.group(1)) / 16}rem', css)
# Input: "font-size: 16px; margin: 32px 8px;"
# Output: "font-size: 1rem; margin: 2rem 0.5rem;"
12. Convert require() to import Statements
# Pattern: const (\w+) = require\(['"]([^'"]+)['"]\);?
# Replace: import $1 from '$2';
# JavaScript:
code.replace(
/const (\w+) = require\(['"]([^'"]+)['"]\);?/g,
"import $1 from '$2';"
);
# VS Code: works directly with the pattern above
# Destructured requires:
# const { a, b } = require('module'); -> import { a, b } from 'module';
# Pattern: const (\{[^}]+\}) = require\(['"]([^'"]+)['"]\);?
# Replace: import $1 from '$2';
13. Wrap Words in HTML Tags
# Wrap each word in a span:
# Pattern: (\b\w+\b)
# Replace: <span class="word">$1</span>
# Wrap specific terms in strong tags:
# Pattern: \b(regex|pattern|replace)\b
# Replace: <strong>$1</strong>
# Add target="_blank" to external links:
# Pattern: <a href="(https?://[^"]+)"
# Replace: <a href="$1" target="_blank" rel="noopener noreferrer"
14. Remove Code Comments
# Single-line JavaScript/C comments:
code.replace(/\/\/.*$/gm, '');
# Multi-line C-style comments:
code.replace(/\/\*[\s\S]*?\*\//g, '');
# Python/Shell comments:
code.replace(/#.*$/gm, '');
# Both single and multi-line:
code.replace(/\/\/.*$|\/\*[\s\S]*?\*\//gm, '');
# Be careful: this will also match URLs containing //
# More precise (skip strings and URLs):
# This is better handled with a proper parser
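Short of a full parser, a common middle ground is to match string literals first and keep them, so that a // inside a quoted URL is never seen as a comment. The Python sketch below illustrates the idea for C-style comments. It is an approximation under stated assumptions: it does not understand JavaScript regex literals or template strings, and strip_comments is a hypothetical name.

```python
import re

_TOKEN = re.compile(
    r'''
    ( "(?:\\.|[^"\\])*"      # group 1: double-quoted string, kept as-is
    | '(?:\\.|[^'\\])*' )    # ... or single-quoted string
    | //[^\n]*               # line comment, dropped
    | /\*.*?\*/              # block comment, dropped
    ''',
    re.VERBOSE | re.DOTALL,
)

def strip_comments(code):
    # Strings come back unchanged; comments (group 1 is None) become empty.
    return _TOKEN.sub(lambda m: m.group(1) or '', code)

source = 'var u = "http://x.com"; // note\nvar v = 1; /* gone */'
print(strip_comments(source))
```

Because alternation tries the string branch first at each position, the comment branches can only match outside quoted text.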
15. Reformat JSON Keys
# Convert JSON keys from camelCase to snake_case:
# Pattern: "(\w+)"(\s*):
# Combined with the camelCase conversion:
# JavaScript:
json.replace(/"(\w+)"(\s*):/g, (match, key, space) => {
const snakeKey = key.replace(/([a-z])([A-Z])/g, '$1_$2').toLowerCase();
return `"${snakeKey}"${space}:`;
});
// Input:  {"firstName": "John", "lastName": "Smith"}
// Output: {"first_name": "John", "last_name": "Smith"}
16. Extract Domain from URLs
# Pattern: https?://([^/\s]+)
# Replace: $1
# JavaScript:
urls.replace(/https?:\/\/([^/\s]+)\S*/g, '$1');
# Get just the base domain (remove subdomains; naive for suffixes like .co.uk):
urls.replace(/https?:\/\/(?:[\w-]+\.)*([\w-]+\.\w+)\S*/g, '$1');
17. Convert Markdown Bold/Italic to HTML
# Bold (**text**) to <strong>:
text.replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>');
# Italic (*text*) to <em>:
text.replace(/\*([^*]+)\*/g, '<em>$1</em>');
# Code (`text`) to <code>:
text.replace(/`([^`]+)`/g, '<code>$1</code>');
# Headers (## text) to <h2>:
text.replace(/^## (.+)$/gm, '<h2>$1</h2>');
text.replace(/^### (.+)$/gm, '<h3>$1</h3>');
18. Mask Sensitive Data
# Mask credit card numbers (keep first 4 and last 4):
# Pattern: \b(\d{4})\d{8}(\d{4})\b
# Replace: $1********$2
text.replace(/\b(\d{4})\d{8}(\d{4})\b/g, '$1********$2');
// "4111111111111111" -> "4111********1111"
# Mask API keys (keep first 4 and last 4):
text.replace(/\b([A-Za-z0-9]{4})[A-Za-z0-9]{20,}([A-Za-z0-9]{4})\b/g, '$1...$2');
# Mask IP addresses:
text.replace(/\b(\d{1,3})\.\d{1,3}\.\d{1,3}\.(\d{1,3})\b/g, '$1.xxx.xxx.$2');
19. Add Thousand Separators to Numbers
# JavaScript:
num.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
// "1234567" -> "1,234,567"
# This uses a zero-width lookahead assertion:
# \B - not a word boundary (prevents adding comma at start)
# (?= - lookahead: followed by...
# (\d{3})+ - one or more groups of exactly 3 digits
# (?!\d) - NOT followed by another digit
# ) - end lookahead
# Python:
re.sub(r'\B(?=(\d{3})+(?!\d))', ',', '1234567')
# '1,234,567'
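In Python, an alternative to the lookahead is to let the format mini-language insert the separators and use the regex only to find the numbers. A sketch, assuming plain integers without leading zeros (int() would drop them); add_thousands is a hypothetical name.

```python
import re

def add_thousands(text):
    # (?<![\d.]) skips digits that follow a decimal point or another digit,
    # so the fractional part of 3.14159 is left alone.
    return re.sub(r'(?<![\d.])\d+', lambda m: f'{int(m.group()):,}', text)

print(add_thousands('Total: 1234567 units, pi is 3.14159'))
```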
20. Convert Single-Line Objects to Multi-Line
# Expand JSON/JS objects to multi-line:
# Pattern: \{([^{}]+)\}
# Strategy: replace commas with comma+newline, then format
# JavaScript:
compact.replace(/\{([^{}]+)\}/g, (match, content) => {
const pairs = content.split(',').map(p => ' ' + p.trim());
return '{\n' + pairs.join(',\n') + '\n}';
});
# This turns:
# {name: "Alice", age: 30, city: "NYC"}
# Into:
# {
# name: "Alice",
# age: 30,
# city: "NYC"
# }
Advanced: Lookahead and Lookbehind in Replacements
Lookahead and lookbehind assertions let you match based on what comes before or after a position, without including that context in the match itself. This is incredibly useful in replacements because it lets you insert text at precise positions.
Positive Lookahead (?=...)
Matches a position followed by the specified pattern:
# Insert a space before every capital letter (except the first):
text.replace(/(?=[A-Z])/g, ' ').trim();
// "camelCaseVariable" -> "camel Case Variable"
# Better version (only between lower and upper):
text.replace(/([a-z])(?=[A-Z])/g, '$1 ');
// "camelCaseVariable" -> "camel Case Variable"
# Add commas to numbers (the classic example):
"1234567".replace(/\B(?=(\d{3})+$)/g, ',');
// "1,234,567"
Negative Lookahead (?!...)
Matches a position NOT followed by the specified pattern:
# Replace "foo" only when NOT followed by "bar":
text.replace(/foo(?!bar)/g, 'baz');
// "foobar foobaz foo" -> "foobar bazbaz baz"
# Replace ".js" extension but not ".json":
filenames.replace(/\.js(?!on)\b/g, '.ts');
// "app.js config.json utils.js" -> "app.ts config.json utils.ts"
Positive Lookbehind (?<=...)
Matches a position preceded by the specified pattern. Supported in JavaScript (ES2018+), Python, and Perl; note that Python's re module requires the lookbehind to have a fixed width:
# Replace numbers that come after a dollar sign:
text.replace(/(?<=\$)\d+/g, '***');
// "Price: $100, Code: 200" -> "Price: $***, Code: 200"
# Add leading zeros to single-digit hours:
text.replace(/(?<=\b)(\d)(?=:\d{2})/g, '0$1');
// "9:30 AM" -> "09:30 AM"
# Python:
re.sub(r'(?<=\$)\d+', '***', 'Price: $100, Code: 200')
# 'Price: $***, Code: 200'
Negative Lookbehind (?<!...)
# Replace "port" but not when preceded by "air" or "re":
text.replace(/(?<!air|re)port\b/g, 'gateway');
// "airport report port" -> "airport report gateway"
# Replace numbers not preceded by a dot (integers only):
text.replace(/(?<!\.)\b\d+\b(?!\.)/g, '[INT]');
// "42 3.14 100 2.5" -> "[INT] 3.14 [INT] 2.5"
The \K Escape (Perl and PCRE)
\K resets the match start position, effectively acting as a variable-length lookbehind. It is supported in Perl, PCRE, and tools that use PCRE (like grep -P):
# Replace the value after "password:" (in Perl):
perl -pe 's/password:\s*\K\S+/REDACTED/g' config.txt
# "password: secret123" -> "password: REDACTED"
# \K means "forget everything matched so far"
# The text before \K is matched but not included in $&
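Python's standard re module does not support \K. The usual equivalent is to capture the prefix and put it back in the replacement, as in this minimal sketch (the sample line is the one from the Perl example above):

```python
import re

# Group 1 holds the part \K would have "forgotten"; \1 restores it.
redacted = re.sub(r'(password:\s*)\S+', r'\1REDACTED', 'password: secret123')

print(redacted)
```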
Performance Considerations
Regex replacement is fast for most tasks, but certain patterns can cause catastrophic performance. Understanding these pitfalls saves you from replacing a fast operation with one that hangs for minutes.
Catastrophic Backtracking
The most common performance problem in regex is catastrophic backtracking, where the engine tries exponentially many paths through a pattern before failing:
# DANGEROUS pattern:
(a+)+b
# On input "aaaaaaaaaaaaaaaaac", the engine tries 2^n combinations
# before determining no match is possible
# Other problematic patterns:
(.*a){10} # heavy backtracking on long non-matching input
(\w+\s*)+; # can backtrack on long strings without semicolons
(x+x+)+y # classic catastrophic backtracking example
# SAFE alternatives:
# Use atomic groups (?>...) or possessive quantifiers (++, *+) when available
# Use specific character classes instead of .
# Avoid nested quantifiers when possible
# Instead of (a+)+b, use:
a+b # if you don't need the grouping
(?>a+)+b # atomic group (PCRE/Java)
a++b # possessive quantifier (PCRE/Java)
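The difference is easy to observe. The sketch below times the nested pattern against its flat rewrite on a short non-matching input using Python's re module. The input length of 18 is an arbitrary choice that keeps the run short, and actual timings depend on the machine and Python version, so none are quoted here. Python 3.11 and later also accept atomic groups and possessive quantifiers, which this sketch does not rely on.

```python
import re
import time

nested = re.compile(r'(a+)+b')   # nested quantifiers: prone to backtracking
flat = re.compile(r'a+b')        # same matches, no nesting

def timed_search(pattern, text):
    start = time.perf_counter()
    found = pattern.search(text)
    return found, time.perf_counter() - start

subject = 'a' * 18 + 'c'  # no "b", so both patterns must fail

nested_result, nested_seconds = timed_search(nested, subject)
flat_result, flat_seconds = timed_search(flat, subject)

print(f'nested: {nested_seconds:.4f}s, flat: {flat_seconds:.6f}s')
```

Adding a few more characters to the subject is a quick way to see how fast the nested version degrades; increase the length carefully.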
Compiled vs. Inline Patterns
# Python: compile patterns used more than once
import re
# Slower in a loop (each call looks the pattern up in re's internal cache):
for line in lines:
result = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', line)
# Fast (compile once, reuse):
date_pattern = re.compile(r'(\d{4})-(\d{2})-(\d{2})')
for line in lines:
result = date_pattern.sub(r'\2/\3/\1', line)
# JavaScript: RegExp objects are already compiled
const pattern = /(\d{4})-(\d{2})-(\d{2})/g;
// replace() with /g always starts from index 0; reset lastIndex manually
// only when reusing the same regex with test() or exec():
pattern.lastIndex = 0;
Matching Strategy Tips
# 1. Be specific — avoid .* when you can use character classes
# Slow: <div class=".*">
# Fast: <div class="[^"]*">
# 2. Use non-greedy quantifiers when appropriate
# Greedy (may match too much): <b>.*</b>
# Non-greedy (minimal match): <b>.*?</b>
# 3. Anchor your patterns when possible
# Unanchored: \d{4}-\d{2}-\d{2} (scans entire string)
# Anchored: ^\d{4}-\d{2}-\d{2} (fails fast on non-matching lines)
# 4. Order alternations by frequency
# If "http" appears more often than "ftp":
# Better: (http|ftp)://
# Worse: (ftp|http)://
# 5. For large files, use stream processing
# Don't read entire file into memory:
sed 's/pattern/replacement/g' hugefile.txt > output.txt
# sed processes line by line, using minimal memory
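The same line-by-line strategy works inside a program. Below is a Python sketch that streams from one file-like object to another and never holds more than one line in memory. replace_stream is a hypothetical helper, and it assumes the pattern never needs to match across a line break.

```python
import io
import re

def replace_stream(src, dst, pattern, repl):
    """Copy src to dst line by line, applying the substitution; return the match count."""
    compiled = re.compile(pattern)  # compile once, reuse for every line
    total = 0
    for line in src:
        new_line, count = compiled.subn(repl, line)
        dst.write(new_line)
        total += count
    return total

# In-memory streams stand in for real files opened with open().
source = io.StringIO('foo 1\nbar\nfoo foo\n')
target = io.StringIO()
replaced = replace_stream(source, target, r'foo', 'qux')
print(replaced, repr(target.getvalue()))
```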
Common Mistakes and How to Debug Them
Even experienced developers make these mistakes. Knowing the common pitfalls will save you hours of frustration.
Mistake 1: Forgetting to Escape Special Characters
# These characters have special meaning in regex:
# . * + ? ^ $ { } [ ] ( ) | \
# WRONG — matches any character, not a literal dot:
text.replace(/file.txt/g, 'document.txt');
// "file-txt" would also match!
# CORRECT:
text.replace(/file\.txt/g, 'document.txt');
# WRONG — trying to match literal parentheses:
text.replace(/(foo)/g, '[foo]'); // Captures "foo" in group 1
# CORRECT:
text.replace(/\(foo\)/g, '[foo]'); // Matches literal "(foo)"
Mistake 2: Greedy Matching Across Too Much Text
# Input: <b>hello</b> and <b>world</b>
# WRONG — greedy .* matches everything between first <b> and LAST </b>:
text.replace(/<b>(.*)<\/b>/g, '<strong>$1</strong>');
// Result: <strong>hello</b> and <b>world</strong> (BAD!)
# CORRECT — non-greedy .*? matches minimally:
text.replace(/<b>(.*?)<\/b>/g, '<strong>$1</strong>');
// Result: <strong>hello</strong> and <strong>world</strong>
# EVEN BETTER — use negated character class:
text.replace(/<b>([^<]*)<\/b>/g, '<strong>$1</strong>');
// Faster and more precise
Mistake 3: Missing the Global Flag
# JavaScript .replace() only replaces the FIRST match by default:
"foo bar foo".replace(/foo/, 'baz'); // "baz bar foo" — only first!
"foo bar foo".replace(/foo/g, 'baz'); // "baz bar baz" — all matches
# In sed, same issue:
echo "foo bar foo" | sed 's/foo/baz/' # "baz bar foo"
echo "foo bar foo" | sed 's/foo/baz/g' # "baz bar baz"
# Python re.sub() replaces all by default:
re.sub(r'foo', 'baz', 'foo bar foo') # "baz bar baz" (all matches)
re.sub(r'foo', 'baz', 'foo bar foo', count=1) # "baz bar foo" (first only)
Mistake 4: Wrong Backreference Syntax
# Different tools use different syntax:
# JavaScript/VS Code: $1, $2, $<name>
# Python: \1, \2, \g<1>, \g<name>
# sed: \1, \2
# Perl: $1, $2, $+{name}
# WRONG in Python:
re.sub(r'(\w+)', '$1-modified', text) # Literal "$1-modified"
# CORRECT in Python:
re.sub(r'(\w+)', r'\1-modified', text) # Uses capture group
# WRONG in sed (missing backslash):
sed -E 's/(\w+)/1-modified/' file.txt # Inserts a literal "1-modified"
# CORRECT in sed (-E allows unescaped parentheses; \w is a GNU extension):
sed -E 's/(\w+)/\1-modified/' file.txt # Uses capture group
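A related trap in Python is a group reference followed directly by a digit. The sketch below, which assumes only the standard `re` module, shows why `\g<1>` exists alongside `\1`:

```python
import re

text = "item7"

# "$1" has no special meaning in Python, so it is inserted literally.
dollar = re.sub(r"(\d)", "$1", text)

# r"\10" is parsed as a reference to group 10, which this pattern lacks.
try:
    re.sub(r"(\d)", r"\10", text)
    ambiguous_error = None
except re.error as exc:
    ambiguous_error = str(exc)

# \g<1> ends the group number explicitly, so the trailing 0 stays literal.
numbered = re.sub(r"(\d)", r"\g<1>0", text)

# The same idea with a named group.
named = re.sub(r"(?P<digit>\d)", r"\g<digit>0", text)

print(dollar, numbered, named)
print("error for \\10:", ambiguous_error)
```

Using `\g<...>` consistently avoids having to think about what follows the reference.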
Mistake 5: Not Accounting for Multi-Line Input
# By default, ^ and $ match start/end of the ENTIRE string
# Use the multiline flag (m) to match start/end of each LINE
# JavaScript:
text.replace(/^prefix/g, 'new'); // Only matches at string start
text.replace(/^prefix/gm, 'new'); // Matches at each line start
# Python:
re.sub(r'^prefix', 'new', text) # String start only
re.sub(r'^prefix', 'new', text, flags=re.MULTILINE) # Each line start
# Similarly, . does NOT match newlines by default:
text.replace(/start(.*)end/g, ''); // Won't cross lines
text.replace(/start([\s\S]*?)end/g, ''); // Crosses lines
# Or use the 's' (dotall) flag:
text.replace(/start(.*?)end/gs, ''); // . matches newlines with /s
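The Python equivalents of the `m` and `s` flags are `re.MULTILINE` and `re.DOTALL`. A short sketch with made-up sample text shows the effect of each:

```python
import re

lines = "prefix one\nprefix two"

# Without MULTILINE, ^ anchors only at the start of the whole string.
string_start = re.sub(r"^prefix", "new", lines)
# With MULTILINE, ^ also anchors after every newline.
each_line = re.sub(r"^prefix", "new", lines, flags=re.MULTILINE)

block = "keep start\nremove me\nend keep"

# Without DOTALL, . cannot cross the newline, so nothing matches.
single_line = re.sub(r"start(.*?)end", "", block)
# With DOTALL, . matches newlines and the span is removed.
cross_line = re.sub(r"start(.*?)end", "", block, flags=re.DOTALL)

print(repr(string_start))
print(repr(each_line))
print(repr(single_line))
print(repr(cross_line))
```

The flags can be combined with `|`, for example `flags=re.MULTILINE | re.DOTALL`.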
Mistake 6: Replacement String Injection
# User input in replacement strings can be dangerous:
# JavaScript:
const userInput = "$` hacked $'";
text.replace(/safe/, userInput); // $` and $' are interpreted!
# Always escape replacement strings when using user input:
function escapeReplacement(str) {
  return str.replace(/\$/g, '$$$$'); // Each $ becomes $$, which .replace() then reads as one literal $
}
text.replace(/safe/, escapeReplacement(userInput));
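# Python does not treat $ specially, but re.sub() still interprets backslash
# escapes (\1, \g<name>, \n) in a string replacement, and an unknown escape such as \d raises re.error.
# For user-supplied replacements, pass a function instead: re.sub(pattern, lambda m: user_input, text)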
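In Python the risky character is the backslash rather than the dollar sign. One way to insert untrusted text verbatim is a function replacement, sketched here with a hypothetical helper named `replace_verbatim`:

```python
import re

def replace_verbatim(pattern: str, replacement: str, text: str) -> str:
    """Substitute `replacement` exactly as given, with no escape processing."""
    # The return value of a replacement function is inserted as-is.
    return re.sub(pattern, lambda m: replacement, text)

# Illustrative untrusted input: a Windows-style path that ends in "\1".
user_input = r"C:\temp\1"

# As a string replacement, \t becomes a tab and \1 becomes group 1.
interpreted = re.sub(r"(safe)", user_input, "a safe b")

# As a function replacement, the text arrives unchanged.
verbatim = replace_verbatim(r"(safe)", user_input, "a safe b")

print(repr(interpreted))
print(repr(verbatim))
```

Doubling the backslashes with `user_input.replace("\\", "\\\\")` is an alternative when a string replacement is required.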
Debugging Strategies
# 1. Test on a small sample first
echo "test input" | sed 's/pattern/replacement/'
# 2. Use regex testers with step-by-step matching
# Our Regex Debugger shows exactly how the engine processes each character
# 3. Build patterns incrementally
# Start with a simple match, then add complexity:
\d # matches a digit
\d+ # matches one or more digits
\d{4} # matches exactly 4 digits
\d{4}-\d{2} # matches YYYY-MM
# 4. Use verbose/extended mode for complex patterns
# Python:
pattern = re.compile(r'''
    (?P<year>\d{4})   # Year (4 digits)
    -                 # Separator
    (?P<month>\d{2})  # Month (2 digits)
    -                 # Separator
    (?P<day>\d{2})    # Day (2 digits)
''', re.VERBOSE)
# 5. Log intermediate results
# JavaScript:
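text.replace(/pattern/g, (match, ...groups) => {
  // groups also ends with the match offset and the full input string
  console.log('Match:', match, 'Groups:', groups);
  return match; // Return the match unchanged while you inspect the output
});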
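Strategies 4 and 5 combine naturally in Python. The sketch below wraps a replacement template in a function that records every match. The helper name `traced_sub` and the sample sentence are assumptions made for illustration:

```python
import re

DATE = re.compile(r"""
    (?P<year>\d{4})   # Year (4 digits)
    -                 # Separator
    (?P<month>\d{2})  # Month (2 digits)
    -                 # Separator
    (?P<day>\d{2})    # Day (2 digits)
""", re.VERBOSE)

def traced_sub(pattern, template, text):
    """Run pattern.sub() while recording each match and its replacement."""
    seen = []

    def wrapper(match):
        # Match.expand() applies the template exactly as sub() would.
        result = match.expand(template)
        seen.append((match.group(0), match.groupdict(), result))
        return result

    return pattern.sub(wrapper, text), seen

output, seen = traced_sub(
    DATE, r"\g<day>/\g<month>/\g<year>",
    "Released 2026-02-11, patched 2026-03-05.",
)

for matched, groups, replaced in seen:
    print(f"{matched!r} -> {replaced!r} groups={groups}")
print(output)
```

Once the log looks right, the wrapper can be dropped and the template passed to `sub()` directly.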
Quick Reference: Replacement Syntax by Tool
| Feature | JavaScript | Python | sed | VS Code | Perl |
|---|---|---|---|---|---|
| Group ref | $1 | \1 or \g<1> | \1 | $1 | $1 |
| Named ref | $<name> | \g<name> | N/A | $<name> | $+{name} |
| Full match | $& | \g<0> | & | $0 | $& |
| Literal $ | $$ | $ (no escape needed) | $ (no escape needed) | $$ | \$ |
| Uppercase | Function only | Function only | \U (GNU) | \U | \U |
| Lowercase | Function only | Function only | \L (GNU) | \L | \L |
| Function replace | Yes | Yes | No | No | Yes (/e flag) |
| Global by default | No (need /g) | Yes | No (need /g) | Manual | No (need /g) |
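The "Function only" entries for Python mean that case conversion has to be done in a replacement function, since there is no `\U` or `\L` token. This minimal sketch uses hypothetical helper names `title_case` and `snake_to_camel`:

```python
import re

def title_case(text: str) -> str:
    """Uppercase the first letter of each word and lowercase the rest."""
    return re.sub(
        r"\b(\w)(\w*)",
        lambda m: m.group(1).upper() + m.group(2).lower(),
        text,
    )

def snake_to_camel(name: str) -> str:
    """Turn snake_case into camelCase by uppercasing the letter after each _."""
    return re.sub(r"_([a-z])", lambda m: m.group(1).upper(), name)

print(title_case("hello WORLD"))
print(snake_to_camel("user_first_name"))
```

The same shape works in JavaScript by passing a callback as the second argument to `.replace()`.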
Putting It All Together: A Real Workflow
Let us walk through a complete, realistic scenario. You have inherited a legacy JavaScript codebase that uses var declarations, old-style string concatenation, and CommonJS modules. You need to modernize it.
# Step 1: Convert var to const/let
# In VS Code, Find (regex):
\bvar\b
# Replace with:
const
# Then manually review and change some to 'let' where reassignment occurs
# Step 2: Convert string concatenation to template literals
# This one is tricky with pure regex, but handles simple cases:
# Find: "([^"]*)" \+ (\w+) \+ "([^"]*)"
# Replace: `$1${$2}$3`
# Step 3: Convert require to import
# Find: const (\w+) = require\('([^']+)'\);
# Replace: import $1 from '$2';
# Step 4: Convert module.exports to export default
# Find: module\.exports = (\w+);
# Replace: export default $1;
# Step 5: Convert function declarations to arrow functions
# Caution: arrow functions are not hoisted and do not bind their own `this`, so review each change
# Find: function (\w+)\(([^)]*)\) \{
# Replace: const $1 = ($2) => {
# Step 6: Verify nothing is broken
npm test
Each step uses a targeted regex replacement to handle one type of transformation. This incremental approach is safer than trying to do everything in a single complex pattern. Run your tests after each step to catch issues early.
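When the same steps have to run across many files, they can be scripted. Below is a rough Python sketch of steps 1 to 4, with the patterns translated to Python's `\1` replacement syntax. The `STEPS` list, the `modernize` helper, and the sample source are all assumptions made for illustration, and the output still needs the manual review described above:

```python
import re

# (pattern, replacement) pairs, applied in order.
STEPS = [
    # Step 1: var -> const (review later for variables that need let)
    (r"\bvar\b", "const"),
    # Step 2: "a" + name + "b" -> `a${name}b` (simple cases only)
    (r'"([^"]*)" \+ (\w+) \+ "([^"]*)"', r"`\1${\2}\3`"),
    # Step 3: require -> import (assumes a default export)
    (r"const (\w+) = require\('([^']+)'\);", r"import \1 from '\2';"),
    # Step 4: module.exports -> export default
    (r"module\.exports = (\w+);", r"export default \1;"),
]

def modernize(source: str) -> str:
    """Apply each replacement in turn and return the transformed source."""
    for pattern, replacement in STEPS:
        source = re.sub(pattern, replacement, source)
    return source

legacy = (
    "var fs = require('fs');\n"
    'var greet = "Hello, " + name + "!";\n'
    "module.exports = greet;\n"
)

print(modernize(legacy))
```

Keeping the steps in a list preserves the one-transformation-at-a-time approach, so a failing test can be traced back to a single pattern.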
Conclusion
Regex find and replace is a skill that pays dividends throughout your entire career. Once you understand capture groups, backreferences, and the replacement syntax for your tools, you can transform text with a precision and speed that no manual editing can match.
Start with the basics: simple patterns with $1 and $2 replacements in your code editor. Move to the command line with sed when you need to process files in bulk. Graduate to function replacements in JavaScript and Python when you need computed transformations. And always test your patterns on a small sample before applying them to your entire codebase.
The recipes in this guide cover the most common real-world scenarios, but the real power of regex replace comes from combining these techniques to solve your specific problems. Keep this guide bookmarked, practice the patterns, and regex replace will become one of the most valuable tools in your development workflow.