Mastering JavaScript Regular Expressions

A comprehensive guide to understanding and using regex patterns effectively

Introduction to Regular Expressions

A regular expression (regex or RegExp) is a sequence of characters that defines a search pattern. Such patterns drive string searching, "find" and "find and replace" operations, and input validation.

Regular expressions are both powerful and flexible, allowing developers to:

  • Match specific character patterns within text
  • Extract information from strings
  • Validate user input against predefined patterns
  • Perform complex search and replace operations

Did You Know?

Regular expressions were invented by mathematician Stephen Cole Kleene in the 1950s as a way to describe regular languages. They were later incorporated into programming languages, with Perl being particularly influential in popularizing their use.

Why Use Regular Expressions?

Advantages

  • Conciseness: Express complex string operations in a compact form
  • Flexibility: Create patterns that can match a wide variety of text
  • Efficiency: Perform complex string operations with minimal code
  • Standardization: Regex syntax is similar across many programming languages

Common Use Cases

  • Form validation (emails, passwords, phone numbers)
  • Data extraction and scraping
  • Search and replace operations
  • Syntax highlighting in code editors
  • URL routing in web frameworks
  • Parsing and tokenizing text

Basic Syntax

In JavaScript, regular expressions can be created in two ways:

// Using literal notation (preferred for static patterns)
const regex1 = /pattern/flags;

// Using the RegExp constructor (useful when pattern is dynamic)
const regex2 = new RegExp("pattern", "flags");
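The difference between the two notations matters most for backslashes and for patterns built at runtime. The sketch below is illustrative only: `escapeRegExp` and the `userInput` value are hypothetical names introduced here, not part of any built-in API.

```javascript
// Hypothetical helper: escape every character that is special in a regex,
// so arbitrary text can be matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Literal notation: a backslash is written once.
const literalDigits = /\d+/g;

// Constructor notation: the pattern is a string, so the backslash is doubled.
const constructedDigits = new RegExp("\\d+", "g");

// Pattern built from runtime input (a hard-coded stand-in here).
const userInput = "1.5";
const dynamic = new RegExp(escapeRegExp(userInput), "g");

console.log("a1b22".match(literalDigits));     // ["1", "22"]
console.log("a1b22".match(constructedDigits)); // ["1", "22"]
console.log("1.5 and 125".match(dynamic));     // ["1.5"]
```

Without the escaping step, the dot in "1.5" would act as a wildcard and "125" would match as well.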

Regex Flags

Flag  Name              Description                                    Example
g     Global            Match all occurrences, not just the first one  /abc/g
i     Case-insensitive  Match regardless of case                       /abc/i
m     Multiline         ^ and $ match start/end of each line           /^abc/m
s     Dotall            . matches newlines too                         /a.c/s
u     Unicode           Treat pattern as Unicode                       /\u{1F600}/u
y     Sticky            Match only from lastIndex                      /abc/y
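A short sketch of how each flag changes the outcome on the same input. The sample string and variable names are made up for illustration.

```javascript
const text = "Cat\ncat\nCAT";

// g + i: every occurrence, ignoring case
const allCats = text.match(/cat/gi);      // ["Cat", "cat", "CAT"]

// Without m, ^ matches only at the very start of the string
const startOnly = text.match(/^cat/gi);   // ["Cat"]

// With m, ^ also matches right after each line break
const eachLine = text.match(/^cat/gim);   // ["Cat", "cat", "CAT"]

// s: the dot may match the newline between the first two lines
const dotNoS = /Cat.cat/.test(text);      // false
const dotWithS = /Cat.cat/s.test(text);   // true

// u: enables \u{...} code point escapes
const emoji = /\u{1F600}/u.test("\u{1F600}"); // true

// y: the match must start exactly at lastIndex
const sticky = /cat/y;
sticky.lastIndex = 4;                     // "cat" begins at index 4
const stickyHit = sticky.test(text);      // true
```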

Basic Methods

Method     Description                                Example                      Result
test()     Tests if pattern exists in string          /hello/.test("hello world")  true
exec()     Returns array of match information         /h(e)llo/.exec("hello")      ["hello", "e"]
match()    String method that returns matches         "hello".match(/l/g)          ["l", "l"]
replace()  String method that replaces matches        "hello".replace(/l/g, "r")   "herro"
search()   String method that returns index of match  "hello".search(/l/)          2
split()    String method that splits by matches       "a,b,c".split(/,/)           ["a", "b", "c"]
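The methods above can be tried side by side on one string. This is a minimal sketch with invented sample data. It also uses `matchAll`, a standard string method not listed in the table, and shows one common surprise: a regex with the g flag remembers its position between `test()` calls.

```javascript
const sentence = "The year 2023 follows 2022.";
const yearPattern = /\d{4}/;

console.log(yearPattern.test(sentence));          // true
console.log(yearPattern.exec(sentence).index);    // 9
console.log(sentence.search(yearPattern));        // 9
console.log(sentence.match(/\d{4}/g));            // ["2023", "2022"]
console.log(sentence.replace(/\d{4}/g, "YYYY"));  // "The year YYYY follows YYYY."
console.log("2023-01-15".split(/-/));             // ["2023", "01", "15"]

// matchAll (requires the g flag) yields one match object per occurrence
const positions = [...sentence.matchAll(/\d{4}/g)].map(m => m.index); // [9, 22]

// With g, test() resumes from lastIndex on the next call
const stateful = /\d{4}/g;
const first = stateful.test("2023");   // true, lastIndex is now 4
const second = stateful.test("2023");  // false, the search resumed at index 4
```

When a pattern is reused for repeated `test()` calls, leaving out the g flag avoids this statefulness.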

Special Characters & Metacharacters

Metacharacters are characters with special meaning in regex:

Character  Description                        Example         Matches
.          Any character except newline       /h.t/           "hat", "hot", "h@t", etc.
^          Start of string                    /^hello/        "hello world" but not "say hello"
$          End of string                      /world$/        "hello world" but not "world view"
\d         Digit (0-9)                        /\d{3}/         "123", "456", etc.
\w         Word character (a-z, A-Z, 0-9, _)  /\w+/           "hello", "test123", etc.
\s         Whitespace                         /hello\sworld/  "hello world"
\D         Non-digit                          /\D+/           "hello", "world", etc.
\W         Non-word character                 /\W/            "!", "@", " ", etc.
\S         Non-whitespace                     /\S+/           "hello", "world123", etc.
\b         Word boundary                      /\bcat\b/       "cat" but not "category"
[ ]        Character class                    /[aeiou]/       Any vowel
[^ ]       Negated character class            /[^0-9]/        Any non-digit
|          Alternation (OR)                   /cat|dog/       "cat" or "dog"
( )        Grouping                           /(ab)+/         "ab", "abab", etc.

Watch Out!

To match literal special characters like ., *, +, etc., you need to escape them with a backslash: \., \*, \+
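A small sketch of why escaping matters, using a made-up price check. The variable names are illustrative.

```javascript
// Unescaped dot: any character may sit between the digits
const loose = /^\d+.\d{2}$/;
// Escaped dot: only a literal period is accepted
const strict = /^\d+\.\d{2}$/;

const looseResults = ["19.99", "19x99"].map(s => loose.test(s));   // [true, true]
const strictResults = ["19.99", "19x99"].map(s => strict.test(s)); // [true, false]

// Inside a character class, + and * are literal and need no backslash
const operators = "1+2*3".match(/[+*]/g);                  // ["+", "*"]

// \b restricts the match to the whole word
const words = "cat category concat".match(/\bcat\b/g);     // ["cat"]
```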

Quantifiers & Grouping

Quantifiers

Quantifiers specify how many instances of a character, group, or character class must be present for a match:

Quantifier  Description      Example    Matches
*           0 or more        /ab*c/     "ac", "abc", "abbc", etc.
+           1 or more        /ab+c/     "abc", "abbc", etc. (not "ac")
?           0 or 1           /colou?r/  "color" or "colour"
{n}         Exactly n        /\d{3}/    "123", "456", etc.
{n,}        n or more        /\d{2,}/   "12", "123", "1234", etc.
{n,m}       Between n and m  /\d{2,4}/  "12", "123", "1234" (not "1"; in "12345" only the first four digits match)
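Quantifiers limit how much a single match consumes, not how long the whole string may be. The sketch below, with invented sample values, contrasts an unanchored pattern with an anchored one.

```javascript
const samples = ["1", "12", "1234", "12345"];

// Unanchored: true whenever 2 to 4 consecutive digits occur anywhere
const unanchored = samples.map(s => /\d{2,4}/.test(s));   // [false, true, true, true]

// Anchored: the entire string must consist of 2 to 4 digits
const anchored = samples.map(s => /^\d{2,4}$/.test(s));   // [false, true, true, false]

// ? makes the preceding character optional, but allows it only once
const spellings = ["color", "colour", "colouur"].map(s => /^colou?r$/.test(s)); // [true, true, false]

// * accepts zero occurrences, + requires at least one
const starMatchesAc = /^ab*c$/.test("ac");  // true
const plusMatchesAc = /^ab+c$/.test("ac");  // false
```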

Greedy vs. Lazy Matching

By default, quantifiers are greedy - they match as much as possible. Adding a ? after a quantifier makes it lazy - matching as little as possible.

// Greedy matching: .* runs on to the last </div>
"<div>Hello</div><div>World</div>".match(/<div>.*<\/div>/)[0];
// Result: "<div>Hello</div><div>World</div>"

// Lazy matching: .*? stops at the first </div>
"<div>Hello</div><div>World</div>".match(/<div>.*?<\/div>/)[0];
// Result: "<div>Hello</div>"

Capturing Groups

Parentheses ( ) create capturing groups that remember the matched text:

const regex = /(\d{3})-(\d{3})-(\d{4})/;
const match = "123-456-7890".match(regex);
// match[0] = "123-456-7890" (full match)
// match[1] = "123" (first group)
// match[2] = "456" (second group)
// match[3] = "7890" (third group)
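Captured groups are also available while replacing and inside the pattern itself. The following is a brief sketch with illustrative variable names.

```javascript
const phonePattern = /(\d{3})-(\d{3})-(\d{4})/;

// $1, $2, $3 in the replacement string refer to the captured groups
const formatted = "123-456-7890".replace(phonePattern, "($1) $2-$3");
// "(123) 456-7890"

// A replacement function receives the full match, then each group
const masked = "123-456-7890".replace(
  phonePattern,
  (_full, _area, _prefix, line) => `***-***-${line}`
);
// "***-***-7890"

// \1 inside the pattern refers back to group 1: here it finds a repeated word
const doubled = /\b(\w+) \1\b/.exec("this is is a test");
// doubled[0] is "is is", doubled[1] is "is"
```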

Named Capturing Groups

You can name capturing groups for easier reference:

const regex = /(?<areaCode>\d{3})-(?<prefix>\d{3})-(?<lineNumber>\d{4})/;
const match = "123-456-7890".match(regex);
// match.groups.areaCode = "123"
// match.groups.prefix = "456"
// match.groups.lineNumber = "7890"
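Named groups can be destructured directly and referenced as `$<name>` in a replacement string. This sketch uses a date pattern chosen for illustration.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Destructure the groups object by name
const { year, month, day } = "2023-12-31".match(datePattern).groups;
// year "2023", month "12", day "31"

// $<name> inserts a named group into the replacement
const usDate = "2023-12-31".replace(datePattern, "$<month>/$<day>/$<year>");
// "12/31/2023"

// With the g flag, matchAll exposes groups on every match
const log = "start 2023-01-01 end 2023-12-31";
const globalDatePattern = new RegExp(datePattern.source, "g");
const months = [...log.matchAll(globalDatePattern)].map(m => m.groups.month);
// ["01", "12"]
```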

Common Regex Patterns

Here are some commonly used regex patterns for various validation scenarios:

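Email (Simple)
  • Regex: /^[\w.-]+@[\w.-]+\.\w+$/
  • Description: Basic email validation
  • Valid examples: user@example.com, john.doe@company.co.uk
  • Invalid examples: user@.com, @example.com

Email (RFC 5322-style)
  • Regex: /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
  • Description: Stricter email validation. This is the pattern the HTML specification uses for email inputs; it is a practical subset and does not cover the full RFC 5322 grammar.
  • Valid examples: user+tag@example.com, first-last@domain.co.uk
  • Invalid examples: user@domain@domain.com, user@-domain.com

Phone (US)
  • Regex: /^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$/
  • Description: US phone number with optional country code
  • Valid examples: 123-456-7890, (123) 456-7890, +1 123 456 7890
  • Invalid examples: 123-45-6789, 123-456-789

Phone (International)
  • Regex: /^\+(?:[0-9] ?){6,14}[0-9]$/
  • Description: International phone number with leading +, 7 to 15 digits
  • Valid examples: +1 123 456 7890, +44 7911 123456
  • Invalid examples: +123, 1234567890

URL
  • Regex: /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})[\/\w.-]*\/?$/
  • Description: Web URL with optional protocol (lowercase host, no query string or fragment)
  • Valid examples: https://example.com, example.com/path
  • Invalid examples: http:/example.com, https://example

Date (MM/DD/YYYY)
  • Regex: /^(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/(19|20)\d\d$/
  • Description: Date in MM/DD/YYYY format, years 1900-2099. Checks the format only, so 02/31/2023 passes.
  • Valid examples: 01/01/2023, 12/31/2023
  • Invalid examples: 13/01/2023, 01/32/2023

Date (YYYY-MM-DD)
  • Regex: /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/
  • Description: ISO date format. Checks the format only, so 2023-02-31 passes.
  • Valid examples: 2023-01-01, 2023-12-31
  • Invalid examples: 2023-13-01, 2023-01-32

Strong Password
  • Regex: /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/
  • Description: Min 8 chars, 1 uppercase, 1 lowercase, 1 number, 1 special char from @$!%*?&
  • Valid examples: Password1!, Str0ng@Pass
  • Invalid examples: password, Password, Password1

Username
  • Regex: /^[a-zA-Z0-9_]{3,16}$/
  • Description: 3-16 characters, letters, numbers, underscore
  • Valid examples: user123, john_doe
  • Invalid examples: u$er, user-name

IP Address
  • Regex: /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
  • Description: IPv4 address
  • Valid examples: 192.168.1.1, 255.255.255.255
  • Invalid examples: 256.0.0.1, 192.168.1

Postal Code (US)
  • Regex: /^\d{5}(-\d{4})?$/
  • Description: US ZIP code with optional +4
  • Valid examples: 12345, 12345-6789
  • Invalid examples: 1234, 12345-67

Credit Card
  • Regex: /^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})$/
  • Description: Major credit card number formats (prefix and length only, no checksum)
  • Valid examples: 4111111111111111, 5555555555554444
  • Invalid examples: 41111111111111, 3111111111111111

Hex Color
  • Regex: /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/
  • Description: Hex color code with optional #
  • Valid examples: #fff, #f0f0f0
  • Invalid examples: #ffff, #xyz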
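One way to put patterns like these to work is a small lookup of named validators. The sketch below copies four of the patterns above; `patterns`, `isValid`, and `isRealIsoDate` are hypothetical names introduced for illustration. It also shows that a format pattern alone does not guarantee a real calendar date, so a second check is layered on top.

```javascript
// A few patterns from the list above, keyed by name
const patterns = {
  username: /^[a-zA-Z0-9_]{3,16}$/,
  zip: /^\d{5}(-\d{4})?$/,
  hexColor: /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/,
  isoDate: /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/,
};

// Hypothetical helper: look up a pattern by name and test a value
function isValid(kind, value) {
  const pattern = patterns[kind];
  if (!pattern) throw new Error(`Unknown pattern: ${kind}`);
  return pattern.test(value);
}

// Hypothetical follow-up check: the format must match AND the date must exist.
// Date.UTC maps years 0-99 to 1900-1999, so such years are rejected here.
function isRealIsoDate(value) {
  if (!patterns.isoDate.test(value)) return false;
  const [y, m, d] = value.split("-").map(Number);
  const date = new Date(Date.UTC(y, m - 1, d));
  return (
    date.getUTCFullYear() === y &&
    date.getUTCMonth() === m - 1 &&
    date.getUTCDate() === d
  );
}

console.log(isValid("username", "john_doe"));   // true
console.log(isValid("zip", "12345-67"));        // false
console.log(isValid("isoDate", "2023-02-31"));  // true (format only)
console.log(isRealIsoDate("2023-02-31"));       // false
console.log(isRealIsoDate("2024-02-29"));       // true
```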

Real-World Applications

Form Validation Example

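The registration form checks five fields:

  • Full name: letters and spaces only, minimum 3 characters
  • Email: valid email format required
  • Phone: format XXX-XXX-XXXX
  • Password: min 8 chars, including uppercase, lowercase, number, and special character
  • Website (optional): valid URL format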

// Form validation with regex
document.getElementById('registrationForm').addEventListener('submit', function(e) {
  e.preventDefault();
  
  const name = document.getElementById('fullName').value;
  const email = document.getElementById('userEmail').value;
  const phone = document.getElementById('userPhone').value;
  const password = document.getElementById('userPassword').value;
  const website = document.getElementById('website').value;
  
  // Name: letters and spaces only, min 3 chars
  const nameRegex = /^[A-Za-z\s]{3,}$/;
  
  // Email: standard format validation
  const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  
  // Phone: XXX-XXX-XXXX format
  const phoneRegex = /^\d{3}-\d{3}-\d{4}$/;
  
  // Password: min 8 chars, 1 uppercase, 1 lowercase, 1 number, 1 special char
  const passwordRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
  
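  // Website: optional, but must be valid if provided. The path part uses a single
  // quantifier, since a nested form such as ([\/\w.-]*)* risks catastrophic backtracking.
  const websiteRegex = /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})[\/\w.-]*\/?$/;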
  
  let isValid = true;
  let errorMessage = '';
  
  if (!nameRegex.test(name)) {
    errorMessage += 'Name must contain only letters and spaces (min 3 characters).\n';
    isValid = false;
  }
  
  if (!emailRegex.test(email)) {
    errorMessage += 'Please enter a valid email address.\n';
    isValid = false;
  }
  
  if (!phoneRegex.test(phone)) {
    errorMessage += 'Phone must be in XXX-XXX-XXXX format.\n';
    isValid = false;
  }
  
  if (!passwordRegex.test(password)) {
    errorMessage += 'Password must be at least 8 characters and include uppercase, lowercase, number, and special character.\n';
    isValid = false;
  }
  
  if (website && !websiteRegex.test(website)) {
    errorMessage += 'Please enter a valid website URL.\n';
    isValid = false;
  }
  
  if (isValid) {
    alert('Registration successful!');
    this.reset();
  } else {
    alert('Please fix the following errors:\n\n' + errorMessage);
  }
});
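The same checks can be separated from the DOM so they are easy to test on their own. This is a sketch under assumptions: `rules` and `validateRegistration` are hypothetical names, the patterns are copied from the handler above, the password message is shortened, and the optional website field is left out for brevity.

```javascript
// Validation rules as data: one entry per required field
const rules = [
  { field: "name", pattern: /^[A-Za-z\s]{3,}$/,
    message: "Name must contain only letters and spaces (min 3 characters)." },
  { field: "email", pattern: /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/,
    message: "Please enter a valid email address." },
  { field: "phone", pattern: /^\d{3}-\d{3}-\d{4}$/,
    message: "Phone must be in XXX-XXX-XXXX format." },
  { field: "password",
    pattern: /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/,
    message: "Password does not meet the strength rules." },
];

// Returns the list of error messages; an empty array means the input passed
function validateRegistration(values) {
  const errors = [];
  for (const { field, pattern, message } of rules) {
    // A missing field is treated as an empty string and therefore fails
    if (!pattern.test(values[field] ?? "")) errors.push(message);
  }
  return errors;
}

const ok = validateRegistration({
  name: "Ada Lovelace",
  email: "ada@example.com",
  phone: "123-456-7890",
  password: "Str0ng@Pass",
});
console.log(ok.length); // 0

const bad = validateRegistration({
  name: "Al",
  email: "ada@example.com",
  phone: "1234567890",
  password: "Str0ng@Pass",
});
console.log(bad.length); // 2 (name and phone)
```

Keeping the rules in an array means a submit handler only has to collect the field values and display whatever messages come back.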

Text Processing Example

HTML Tag Extractor

The handler below counts the opening tags in the HTML typed into an input and renders the counts as a table:

document.getElementById('extractTagsBtn').addEventListener('click', function() {
  const html = document.getElementById('htmlInput').value;
  const tagRegex = /<(\w+)(?:\s+[^>]*)?>/g;
  const matches = [...html.matchAll(tagRegex)];
  
  const tagsContainer = document.getElementById('extractedTags');
  tagsContainer.innerHTML = '';
  
  if (matches.length === 0) {
    tagsContainer.innerHTML = 'No tags found';
    return;
  }
  
  const tagCounts = {};
  
  matches.forEach(match => {
    const tag = match[1];
    tagCounts[tag] = (tagCounts[tag] || 0) + 1;
  });
  
  const table = document.createElement('table');
  table.innerHTML = `
    <thead>
      <tr>
        <th>Tag</th>
        <th>Count</th>
      </tr>
    </thead>
    <tbody>
      ${Object.entries(tagCounts).map(([tag, count]) => `
        <tr>
          <td>&lt;${tag}&gt;</td>
          <td>${count}</td>
        </tr>
      `).join('')}
    </tbody>
  `;
  
  tagsContainer.appendChild(table);
});
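The counting step can be tried without a page. In this sketch, `countTags` is a hypothetical function that reuses the tag pattern from the handler above, lowercases tag names, and stores counts in a `Map`. Treat it as a rough approximation rather than an HTML parser: for example, this pattern skips self-closing tags written as `<br/>` and would also count tags inside comments.

```javascript
// Count opening tags by name using the same pattern as the handler above
function countTags(html) {
  const tagRegex = /<(\w+)(?:\s+[^>]*)?>/g;
  const counts = new Map();
  for (const match of html.matchAll(tagRegex)) {
    const tag = match[1].toLowerCase(); // treat <P> and <p> as the same tag
    counts.set(tag, (counts.get(tag) ?? 0) + 1);
  }
  return counts;
}

const counts = countTags('<div class="a"><p>Hi</p><P>There</P><br></div>');
console.log([...counts.entries()]);
// [["div", 1], ["p", 2], ["br", 1]]
```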

Regex Cheat Sheet

Character Classes

. Any character except newline
\w Word character [a-zA-Z0-9_]
\d Digit [0-9]
\s Whitespace character
\W Non-word character
\D Non-digit
\S Non-whitespace character
[abc] Any of a, b, or c
[^abc] Not a, b, or c
[a-z] Any character a through z

Anchors

^ Start of string or line
$ End of string or line
\b Word boundary
\B Non-word boundary

Escaped Characters

\. Literal period
\+ Literal plus sign
\* Literal asterisk
\? Literal question mark
\\ Literal backslash

Quantifiers

* 0 or more
+ 1 or more
? 0 or 1
{3} Exactly 3
{3,} 3 or more
{1,3} Between 1 and 3
*? 0 or more (lazy)
+? 1 or more (lazy)

Groups & Lookarounds

(abc) Capture group
(?:abc) Non-capturing group
(?<name>abc) Named capture group
(?=abc) Positive lookahead
(?!abc) Negative lookahead
(?<=abc) Positive lookbehind
(?<!abc) Negative lookbehind

Pro Tips

  • Use non-capturing groups (?:...) when you don't need to reference the group later
  • Use lookaheads/lookbehinds for complex validation without including the matched text
  • Remember that .* is greedy - use .*? for non-greedy matching
  • Test your regex with various inputs to ensure it handles edge cases
  • Break complex patterns into smaller, more manageable pieces
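The first two tips can be seen in a short sketch. The sample text is invented. Lookbehind is a comparatively recent addition to JavaScript, so its availability in older runtimes is worth checking before relying on it.

```javascript
const receipt = "Cost: $45, discount 10%, total $40";

// Positive lookbehind: digits preceded by "$"; the "$" is not part of the match
const dollars = receipt.match(/(?<=\$)\d+/g);          // ["45", "40"]

// Positive lookahead: digits followed by "%"
const percents = receipt.match(/\d+(?=%)/g);           // ["10"]

// Negative lookbehind: digit runs not preceded by "$" or another digit
const notDollars = receipt.match(/(?<![$\d])\d+/g);    // ["10"]

// Lookahead with a negative lookahead inside: thousands separators
const grouped = "1234567".replace(/\B(?=(?:\d{3})+(?!\d))/g, ","); // "1,234,567"

// Non-capturing group: (?:ab) groups for the + but adds no entry to the result
const result = /(?:ab)+(c)/.exec("ababc");
// result[0] is "ababc", result[1] is "c", result.length is 2
```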

Learning Resources

Books

  • "Mastering Regular Expressions" by Jeffrey Friedl
  • "Regular Expressions Cookbook" by Jan Goyvaerts & Steven Levithan
  • "Regular Expression Pocket Reference" by Tony Stubblebine