Python RegEx Tester


Test and debug Python regular expressions online with the Qodex Python Regex Tester. Instantly highlight matches and refine patterns used for email validation, password checks, phone number validation, and more. Use data from the Email Generator or UUID Generator, and compare behavior with the Java or JavaScript Regex Testers.

Explanation
  • [A-Z]: uppercase letters
  • [a-z]: lowercase letters
  • [0-9]: digits
  • \.: a literal dot
  • +: one or more of the preceding
  • *: zero or more of the preceding
  • ?: optional (zero or one)
  • ^: start of string
  • $: end of string

Regular Expression - Documentation

What is Python Regex?

Python uses the built-in re module to support regular expressions. Regex allows you to match, extract, and transform text using patterns. It’s widely used in:


  • Data validation (e.g., email, password, phone numbers)

  • Text processing and cleanup

  • Web scraping and log analysis

  • Extracting patterns from strings


Key Features of the Python RegEx Tester

  • Real-Time Matching – Immediate pattern matching and highlight as you type.

  • Supports Python re Syntax – Works exactly like Python’s regex engine.

  • Capturing Groups Displayed – Shows capture groups and matches.

  • Beginner-Friendly – Just paste your regex and test string—no coding required.

  • Integrates with Test Tools – Try with Address Generator, Password Generator, or MAC Address Generator.


Metacharacters

  • . - Matches any character except newline (\n).

    Example: a.b matches acb, a9b, etc., but not ab.

  • ^ - Matches the start of a string.

    Example: ^Hello matches “Hello world” but not “Say Hello”.

  • $ - Matches the end of a string or just before the newline at the end.

    Example: world$ matches “Hello world” but not “world peace”.

  • | - Acts as a logical OR operator.

    Example: cat|dog matches either “cat” or “dog”.
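
The metacharacter examples above can be checked with a few lines of Python. This is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

# "." needs exactly one character (not a newline) between "a" and "b"
dot_hits = [s for s in ["acb", "a9b", "ab", "a\nb"] if re.fullmatch(r"a.b", s)]
print(dot_hits)  # ['acb', 'a9b']

# "^" and "$" anchor the pattern to the start and end of the string
starts_with_hello = bool(re.search(r"^Hello", "Hello world"))  # True
say_hello = bool(re.search(r"^Hello", "Say Hello"))            # False
ends_with_world = bool(re.search(r"world$", "Hello world"))    # True
world_peace = bool(re.search(r"world$", "world peace"))        # False

# "|" matches either alternative
pets = re.findall(r"cat|dog", "a cat chased a dog")
print(pets)  # ['cat', 'dog']
```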


Character Classes

  • [abc] - Matches any one of a, b, or c.

    Example: gr[ae]y matches both “gray” and “grey”.

  • [^abc] - Negates the set. Matches any character except a, b, or c.

    Example: [^0-9] matches any non-digit.

  • [a-zA-Z] - Matches any single ASCII letter, uppercase or lowercase.

    Example: [A-Z] matches uppercase letters only.
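
A short sketch of the three character-class forms above, using re.findall() on made-up sample strings:

```python
import re

# [ae] accepts either spelling; "groy" is rejected
colours = re.findall(r"gr[ae]y", "gray, grey, groy")
print(colours)  # ['gray', 'grey']

# [^0-9]+ collects runs of non-digit characters
non_digits = re.findall(r"[^0-9]+", "abc123def")
print(non_digits)  # ['abc', 'def']

# [A-Z] matches one uppercase ASCII letter at a time
capitals = re.findall(r"[A-Z]", "Hello World")
print(capitals)  # ['H', 'W']
```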


Predefined Character Classes

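  • \d : Matches any decimal digit. For str patterns this includes digits from other scripts; it is equivalent to [0-9] only with re.ASCII (or in bytes patterns).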

  • \D : Matches any non-digit character.

  • \s : Matches any whitespace character: space, tab, newline, etc.

  • \S : Matches any non-whitespace character.

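  • \w : Matches any word character: letters, digits, and the underscore. For str patterns this covers Unicode letters and digits; it is equivalent to [a-zA-Z0-9_] only with re.ASCII (or in bytes patterns).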

  • \W : Matches any character not considered a word character.
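
In Python 3, the shorthand classes are Unicode-aware for str patterns unless re.ASCII is set. The sketch below shows the difference on a made-up sample string that contains an accented letter and an Arabic-Indic digit (written as escapes so the code stays ASCII).

```python
import re

# "caf\u00e9" is "café"; "\u0663" is ARABIC-INDIC DIGIT THREE
sample = "Room 42, caf\u00e9_1\t\u0663"

digits_unicode = re.findall(r"\d", sample)           # includes the Arabic-Indic digit
digits_ascii = re.findall(r"\d", sample, re.ASCII)   # only 0-9

words_unicode = re.findall(r"\w+", sample)           # "café_1" stays in one piece
words_ascii = re.findall(r"\w+", sample, re.ASCII)   # the accented letter splits the word

spaces = re.findall(r"\s", sample)                   # two spaces and one tab

print(digits_ascii)  # ['4', '2', '1']
print(words_ascii)   # ['Room', '42', 'caf', '_1']
```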


Quantifiers

  • * - Matches 0 or more repetitions of the preceding pattern.

    Example: ab* matches “a”, “ab”, “abb”, “abbb”…

  • + - Matches 1 or more occurrences.

    Example: ab+ matches “ab”, “abb”, “abbb”… but not “a”.

  • ? - Matches 0 or 1 occurrence, making it optional.

    Example: ab? matches “a” or “ab”.

  • {n} - Exactly n occurrences.

    Example: a{3} matches “aaa”.

  • {n,} - At least n occurrences.

    Example: a{2,} matches “aa”, “aaa”, “aaaa”…

  • {n,m} - Between n and m occurrences.

    Example: a{2,4} matches “aa”, “aaa”, or “aaaa”.
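
To see which strings each quantifier accepts, one option is to test a list of candidates with re.fullmatch(). The helper name full_matches below is hypothetical and exists only for this sketch.

```python
import re

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

candidates = ["a", "ab", "abb", "abbb"]
star = full_matches(r"ab*", candidates)      # ['a', 'ab', 'abb', 'abbb']
plus = full_matches(r"ab+", candidates)      # ['ab', 'abb', 'abbb']
optional = full_matches(r"ab?", candidates)  # ['a', 'ab']

runs = ["a", "aa", "aaa", "aaaa", "aaaaa"]
exactly_3 = full_matches(r"a{3}", runs)      # ['aaa']
at_least_2 = full_matches(r"a{2,}", runs)    # ['aa', 'aaa', 'aaaa', 'aaaaa']
two_to_four = full_matches(r"a{2,4}", runs)  # ['aa', 'aaa', 'aaaa']

# Quantifiers are greedy by default: they take as many repetitions as allowed
greedy = re.search(r"a{2,4}", "aaaaa").group()  # 'aaaa'
```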


Groups

  • (abc) : Capturing group that matches “abc” and stores it.

    Example: (ha)+ matches “ha”, “hahaha”, etc.


  • (?:abc) : Non-capturing group; groups without saving.

    Useful when applying quantifiers or alternations without backreferences.
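
The practical difference between the two group types shows up in what re.findall() returns. A small sketch with a made-up input string:

```python
import re

text = "hahaha, ha"

# With a capturing group, findall() returns the group's content
# (the last repetition of "ha"), not the whole match.
captured = re.findall(r"(ha)+", text)
print(captured)  # ['ha', 'ha']

# With a non-capturing group, findall() returns the whole match.
whole = re.findall(r"(?:ha)+", text)
print(whole)  # ['hahaha', 'ha']

# On a match object, group(0) is the full match and group(1) the captured part.
m = re.search(r"(ha)+", text)
full, last = m.group(0), m.group(1)  # 'hahaha', 'ha'
```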


Why Groups Matter

Grouping with parentheses serves more than just matching patterns—it lets you extract and reuse specific parts of your match, known as capturing groups. This is especially handy for:

  • Extracting key-value pairs from structured text

  • Parsing dates, times, or measurements from logs or forms

  • Pulling out parts of a URL, email address, or file name
    (e.g., getting the domain from an email address or the extension from a file name)


You can access these captured groups using match.groups() or re.findall() in Python, which return the captured parts as a tuple or a list for easy processing. Non-capturing groups let you control pattern logic (alternation, quantifiers) without storing the matched text, which keeps your results tidy when you don’t need everything saved.
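
Here is a rough sketch of those extraction tasks. The key=value log format, the email address, and the file name are all hypothetical inputs chosen for the example.

```python
import re

log_line = "user=alice id=42 date=2024-03-15"  # hypothetical input format

# Two capturing groups: findall() returns a list of (key, value) tuples
pairs = re.findall(r"(\w+)=(\S+)", log_line)
print(pairs)  # [('user', 'alice'), ('id', '42'), ('date', '2024-03-15')]

# groups() returns every captured group of a single match as a tuple
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", log_line)
year, month, day = m.groups()  # '2024', '03', '15'

# Domain of an email address and extension of a file name
domain = re.search(r"@([\w.-]+)", "user@example.com").group(1)   # 'example.com'
extension = re.search(r"\.(\w+)$", "report.final.pdf").group(1)  # 'pdf'
```

Since dict() accepts a list of pairs, dict(pairs) turns the first result into a lookup table.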


Lookahead and lookbehind

Lookahead and lookbehind are part of what’s called zero-width assertions in regular expressions. That’s a fancy way of saying: they let you match stuff based on what comes before or after—without including those surrounding characters in the match itself. In other words, you can check the “context” of your match (what’s nearby), but only capture exactly what you want. Think of them as secret agents, sneaking a peek ahead or behind without leaving a trace.

These are super handy when you want to ensure a match only happens in a specific context but don’t want to actually include that context in your result. For example, grabbing a number only if it’s followed by “px” (like in CSS values), or pulling out a username only if it’s after an “@” symbol.

  • (?=abc) : Positive lookahead; matches if abc follows.

    Example: \d(?=px) matches a digit followed by “px”.


  • (?!abc) : Negative lookahead; matches if abc does not follow.

    Example: \d(?!px) matches a digit that is not immediately followed by “px”; in “10px” it still matches the “1”.


  • (?<=abc) : Positive lookbehind; matches if preceded by abc.

    Example: (?<=@)\w+ matches text after “@” in an email.


  • (?<!abc) : Negative lookbehind; matches if not preceded by abc.
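
The four assertions can be tried side by side. This is an illustrative sketch on made-up strings. Note that Python's re module requires lookbehind patterns to have a fixed width, so something like (?<=a+) is rejected.

```python
import re

css = "width: 10px; height: 5em; margin: 3px"

# Positive lookahead: digit runs immediately followed by "px"
px_values = re.findall(r"\d+(?=px)", css)
print(px_values)  # ['10', '3']

# Negative lookahead, single digit: the "1" of "10px" slips through
naive = re.findall(r"\d(?!px)", "10px")
print(naive)  # ['1']

# Excluding a following digit as well keeps whole numbers together
non_px = re.findall(r"\d+(?!\d|px)", css)
print(non_px)  # ['5']

# Positive lookbehind: word characters right after "@"
after_at = re.findall(r"(?<=@)\w+", "mail user@example.com or ping @bob")
print(after_at)  # ['example', 'bob']

# Negative lookbehind: digit runs not preceded by "$" (or by another digit)
not_prices = re.findall(r"(?<![$\d])\d+", "3 items cost $45")
print(not_prices)  # ['3']
```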


When Are Lookaheads and Lookbehinds Useful?

Lookaheads and lookbehinds shine when you need extra precision—matching text only when it appears in a certain context, but without scooping up the surrounding bits themselves.

Some handy scenarios include:

  • Extracting values with precise boundaries: For instance, pulling out numbers that come right after a specific symbol (like getting prices after a $ without including the dollar sign).

  • Filtering words by context: Grabbing instances of a word only if they’re followed (or not followed) by certain other words, such as finding “Java” only when it’s not followed by “Script.”

  • Capturing text between markers: Selecting text sandwiched between markers or delimiters—say, everything inside brackets—but leaving the brackets behind.

  • Validating complex passwords: Ensuring a string contains (or doesn't contain) required patterns—like at least one uppercase letter, but not allowing forbidden sequences.

With these advanced patterns, you get results that are laser-focused on your needs, whether scraping data, cleaning up logs, or sifting through text for just the right match.

These assertions don’t consume characters—they just check if the surrounding text fits the bill. Use them when you need to filter matches by what’s next door or just behind, while keeping your actual match clean and precise.
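
The four scenarios listed above might look like this in practice. The input strings and the password policy are assumptions made up for this sketch, not rules from any particular system.

```python
import re

# Prices after "$", without the dollar sign itself
prices = re.findall(r"(?<=\$)\d+(?:\.\d{2})?", "Tea $4.50, cake $12")
print(prices)  # ['4.50', '12']

# "Java" only when it is not followed by "Script"
java_only = re.findall(r"Java(?!Script)", "Java, JavaScript and Java SE")
print(java_only)  # ['Java', 'Java']

# Text between square brackets, leaving the brackets behind
bracketed = re.findall(r"(?<=\[)[^\]]*(?=\])", "[INFO] started [worker-1]")
print(bracketed)  # ['INFO', 'worker-1']

# Hypothetical policy: an uppercase letter, a digit, at least 8 characters,
# and the forbidden sequence "password" must not appear anywhere
policy = re.compile(r"(?=.*[A-Z])(?=.*\d)(?!.*password).{8,}")
candidates = ["Secret123", "mypassword1A", "short1A", "alllowercase1"]
accepted = [p for p in candidates if policy.fullmatch(p)]
print(accepted)  # ['Secret123']
```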


Understanding and mastering groups and lookarounds is essential for advanced text processing, data extraction, and building flexible regular expressions that can handle real-world data.


Anchors and Boundaries

  • \b : Word boundary (between \w and \W).

    Example: \bcat\b matches “cat” in “the cat sat” but not “catering”.


  • \B : Non-word boundary.

    Example: \Bend matches the “end” inside “bend” but not the standalone word “end”.


  • \A : Matches the start of the string (unlike ^, it doesn’t change with re.MULTILINE).

  • \Z : Matches only at the very end of the string. Unlike $, it does not match before a trailing newline.

  • \z : Not part of Python’s re syntax before Python 3.14, where it was added as an alias for \Z; on earlier versions use \Z.
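
A quick way to see how these anchors differ is to run them against a string that ends with a newline. The sample text below is made up for this sketch.

```python
import re

# \b: "cat" as a whole word only ("catering" and "cart" are skipped)
cat_word = re.findall(r"\bcat\b", "the cat sat on the catering cart")
print(cat_word)  # ['cat']

# \B: "end" must not start at a word boundary
inside_bend = bool(re.search(r"\Bend", "bend"))  # True
standalone = bool(re.search(r"\Bend", "end"))    # False

text = "first line\nlast line\n"

# $ also matches just before a trailing newline; \Z does not
dollar = bool(re.search(r"line$", text))   # True
big_z = bool(re.search(r"line\Z", text))   # False

# With re.MULTILINE, ^ matches at every line start; \A still matches only once
caret_multiline = re.findall(r"^\w+", text, re.MULTILINE)
a_multiline = re.findall(r"\A\w+", text, re.MULTILINE)
print(caret_multiline)  # ['first', 'last']
print(a_multiline)      # ['first']
```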


Flags

You can pass flags to functions like re.search() or use them inline with (?i), (?m), etc.

  • re.IGNORECASE / re.I : Case-insensitive matching; ignores the case of letters.

  • re.MULTILINE / re.M : ^ and $ match the start/end of each line, not just the whole string.

  • re.DOTALL / re.S : The dot . matches any character, including newlines.

  • re.VERBOSE / re.X : Allows regex patterns to be split with whitespace and comments for clarity.

  • re.ASCII / re.A : Makes \w, \b, \d, \s, etc., match only ASCII characters.
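
As a rough illustration, the same patterns behave differently depending on the flags passed. The sample text is invented; flags are combined with the | operator.

```python
import re

text = "Hello\nhello world\nHELLO"

# re.IGNORECASE: all three spellings match
ignore_case = re.findall(r"hello", text, re.IGNORECASE)
print(ignore_case)  # ['Hello', 'hello', 'HELLO']

# Without re.MULTILINE, ^ matches only at the start of the whole string
first_only = re.findall(r"^hello", text, re.I)
print(first_only)  # ['Hello']

# With re.MULTILINE, ^ matches at the start of every line
every_line = re.findall(r"^hello", text, re.I | re.M)
print(every_line)  # ['Hello', 'hello', 'HELLO']

# By default "." stops at a newline; re.DOTALL lets it cross
dot_default = re.search(r"Hello.hello", text)            # None
dot_all = re.search(r"Hello.hello", text, re.DOTALL)     # matches "Hello\nhello"
```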


Note:
Python does not support a global (g) flag like JavaScript, because functions like re.findall() and re.finditer() are already global—they return all matches by default.


Common Flag Modifiers:

You’ll often see these modifiers used to tweak regex behavior in other languages and tools:

  • g (Global): Apply the regex to find all matches in the string, not just the first.
    (Note: In Python, use re.findall() or re.finditer(); the g flag itself is used in other languages like JavaScript.)

  • i (Ignore case): Makes the pattern case-insensitive. In Python, use re.IGNORECASE or the inline flag (?i).

  • m (Multiline): Changes ^ and $ to match the start/end of each line instead of the whole string. In Python, use re.MULTILINE or (?m) inline.

You can combine these inline:
For example, (?im) at the start of a pattern makes it case-insensitive and multiline.
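
A small sketch comparing inline flags with the equivalent flags argument, plus the Python stand-in for a "global" search. The log text is a made-up example.

```python
import re

text = "error: disk full\nERROR: cpu hot\nwarning: fan slow"

# Inline flags: (?im) switches on IGNORECASE and MULTILINE for the whole pattern
inline = re.findall(r"(?im)^error: (.+)$", text)
print(inline)  # ['disk full', 'cpu hot']

# The same behavior with the flags passed as an argument
flags_arg = re.findall(r"^error: (.+)$", text, re.IGNORECASE | re.MULTILINE)

# No "g" flag needed: finditer() already walks through every match
all_numbers = [m.group() for m in re.finditer(r"\d+", "a1 b22 c333")]
print(all_numbers)  # ['1', '22', '333']
```

Inline flags that apply to the whole pattern belong at the very start of the pattern string.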


Python Regular Expressions Examples


Example 1: Email Validation

Try the Email RegEx Python Validator and Email Generator to test this pattern interactively.

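import re
# Simplified pattern: a quick sanity check, not full address validation
email_pattern = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')
email = "user@example.com"
# fullmatch() requires the whole string to match, so a trailing newline is rejected
print("Email Valid:", bool(email_pattern.fullmatch(email)))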

Example 2: Password Strength Check

Use the Password RegEx Python Validator or generate test data with our Password Generator.

import re
# Each lookahead requires one character type: lowercase, uppercase, digit, symbol
password_pattern = re.compile(r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$')
password = "Aa123456!"
print("Password Strong:", bool(password_pattern.match(password)))

Example 3: Extracting Words from a String

Useful for NLP, logs, or data pipelines.

import re
text = "Regex is #1 at pattern matching!"
word_pattern = re.compile(r'\b\w+\b')
# Prints Regex, is, 1, at, pattern, matching ("#" and "!" are not word characters)
for match in word_pattern.finditer(text):
    print("Found:", match.group())


Capturing Words That Start with an Uppercase Letter

Say you're faced with the task of picking out capitalized words—think names, places, or just the beginning of sentences. Regex has your back, again.


Start by defining your pattern. Something like \b[A-Z]\w*\b does the trick:

  • \b ensures we match at word boundaries, so we're not grabbing mid-word uppercase letters.

  • [A-Z] looks for a single uppercase letter at the start.

  • \w* picks up the rest of the word (letters, digits, and underscores).


Here's how you might use it:

import re
text = "Once upon a Time in New York"
cap_word_pattern = re.compile(r'\b[A-Z]\w*\b')
for match in cap_word_pattern.finditer(text):
    print("Found:", match.group())

Run this on "Once upon a Time in New York", and you'll see it neatly pulls out "Once", "Time", "New", and "York"—every word that kicks off with a capital letter.


How It Works

  1. Enter your regex pattern and sample test string.

  2. View instant matches and capture groups below.

  3. Copy, edit, and refine your pattern until it’s perfect.

  4. Use test data from other tools to simulate real-world cases.


How can you replace matched patterns in a string using re.sub in Python?

Need to swap out parts of a string that match a certain pattern? The re.sub() function is your friend. Here’s how you can use it to replace all digits in a string with a placeholder:

import re

pattern = r'\d+'  # Matches sequences of digits
text = "There are 123 apples and 456 oranges."
replacement = "X"

# Substitute every sequence of digits with 'X'
result = re.sub(pattern, replacement, text)
print("Substitution result:", result)

What’s happening here?

  • The regular expression \d+ finds every group of one or more digits in the input.

  • Each match is replaced by "X", resulting in:
    There are X apples and X oranges.

re.sub() is handy for sanitizing input, anonymizing data (like replacing phone numbers or IDs), and reformatting text in logs, user inputs, and more. Don’t forget you can also use it with functions as the replacement, allowing even more flexible substitutions.
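
As a sketch of the function-as-replacement idea mentioned above: re.sub() calls the function once per match and inserts whatever string it returns. The function name double is hypothetical.

```python
import re

def double(match):
    """Replacement callback: receives the match object, returns the new text."""
    return str(int(match.group()) * 2)

text = "There are 123 apples and 456 oranges."

doubled = re.sub(r"\d+", double, text)
print(doubled)  # There are 246 apples and 912 oranges.

# Backreferences (\1, \2) in the replacement string reuse captured groups
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")
print(swapped)  # example at user

# count limits how many matches are replaced
first_only = re.sub(r"\d+", "X", text, count=1)
print(first_only)  # There are X apples and 456 oranges.
```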


How do you search for patterns in strings using re.search and re.findall in Python?

To get started, you'll first need to import Python’s built-in re module. Here’s a quick rundown of how you can use it to search for patterns in text:

import re

pattern = r'\d+'  # This pattern matches one or more digits
string = "There are 123 apples and 456 oranges"

# Search for the first match
search_result = re.search(pattern, string)
if search_result:
    print("Search result:", search_result.group())

# Find all matches
findall_result = re.findall(pattern, string)
print("Find all result:", findall_result)

  • re.search() scans through the string and returns the first match it finds.

  • re.findall() returns a list of all matches found in the string.

With these functions, you can quickly locate and extract patterns—like numbers, words, or specific sequences—from text data. Whether you’re parsing log files, validating input, or just hunting for hidden gems in a string, regular expressions make the task efficient and flexible.
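
When the position of each match matters, or when the pattern must match at the very start of the string, re.finditer() and re.match() are the usual companions to the two functions above. A minimal sketch on the same kind of sample sentence:

```python
import re

text = "There are 123 apples and 456 oranges"

# re.match() only tries at the beginning of the string
at_start = re.match(r"\d+", text)
print(at_start)  # None

# re.search() scans forward until it finds the first match
anywhere = re.search(r"\d+", text)
print(anywhere.group())  # 123

# re.finditer() yields match objects, which also carry positions
spans = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]
print(spans)  # [('123', 10, 13), ('456', 25, 28)]
```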


How do grouping and capturing work in Python regular expressions, and how can you extract specific parts of a match?

Groups and Lookarounds:

(abc) : Capturing group that matches “abc” and stores it.

Example: (ha)+ matches “ha”, “hahaha”, etc.

Capturing groups are powerful tools in regex for extracting and reusing specific parts of a matched string. Parentheses () not only group patterns, but also allow you to retrieve the exact segments you’re interested in. For example, suppose you want to extract a number-word pair from a string:

import re
pattern = r'(\d+)\s(\w+)'    # Matches "number word" pairs
string = "123 apples"
match = re.search(pattern, string)
if match:
    print("Full match:", match.group(0))      # "123 apples"
    print("First group:", match.group(1))     # "123"
    print("Second group:", match.group(2))    # "apples"


Breaking it down:

  • (\d+) : The first group captures one or more digits.

  • \s : Matches the space between the number and the word.

  • (\w+) : The second group captures one or more word characters (letters, digits, or underscores).

This approach is common for tasks like parsing dates ((\d{4})-(\d{2})-(\d{2})) or extracting file extensions. A lookbehind such as (?<=@)\w+ does a similar job without a group, pulling out the text that follows the @ in an email address.

  • (?:abc) : Non-capturing group; groups without saving.

    Useful when applying quantifiers or alternations without backreferences.

Sometimes you just want to group parts of a pattern for logic, not for extraction. Use (?:...) when you don’t care about capturing.

  • (?=abc) : Positive lookahead; matches if abc follows.

    Example: \d(?=px) matches a digit followed by “px”.

  • (?!abc) : Negative lookahead; matches if abc does not follow.

    Example: \d(?!px) matches digits not followed by “px”.

  • (?<=abc) : Positive lookbehind; matches if preceded by abc.

    Example: (?<=@)\w+ matches text after “@” in an email.

  • (?<!abc) : Negative lookbehind; matches if not preceded by abc.


Lookarounds—both ahead and behind—let you match text only when it is (or isn't) next to some other text, without including that surrounding text in your match. This is handy when you want to filter matches based on context, like finding numbers only when they're not part of a specific measurement, or grabbing a username only after the @ symbol.


These grouping and lookaround mechanisms are essential for tasks such as:

  • Extracting key-value pairs from structured logs

  • Parsing measurements (e.g., finding numbers not followed by "px")

  • Splitting on delimiters but skipping empty fields

  • Extracting parts of URLs, emails, or filenames without including unwanted characters

Mastering groups and lookarounds will supercharge your regex skills, letting you surgically extract, rearrange, or validate text in ways plain text search never could.


Pro Tips for Writing Effective RegEx in Python

  • Compile patterns you reuse with re.compile(); it skips the per-call pattern-cache lookup and keeps the pattern in one named place.

  • Use named groups ((?P<name>...)) for cleaner, readable code.

  • Use re.VERBOSE for large regex, allowing comments and spacing.

  • Leverage re.findall() to return all matches as a list.

  • Avoid regex for deeply nested or structured data—use parsers instead.

  • Use Qodex’s Python Regex Tester to experiment with edge cases and live data.

  • Try generating test data with our Email Generator, UUID Generator, or Password Generator.

  • Combine regex with list comprehensions for powerful one-liners.

  • When debugging, test sections of complex patterns separately before joining them.
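
Several of these tips combine naturally. The sketch below uses re.compile(), re.VERBOSE, named groups, and a list comprehension together; the name DATE and the sample lines are made up for illustration, and the walrus operator requires Python 3.8 or later.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows "#" comments;
# named groups label each captured part.
DATE = re.compile(
    r"""
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    """,
    re.VERBOSE,
)

lines = ["shipped 2024-03-15", "no date here", "returned 2024-04-02"]

# List comprehension: keep the month of every line that contains a date
months = [m.group("month") for line in lines if (m := DATE.search(line))]
print(months)  # ['03', '04']

# groupdict() returns all named groups of one match as a dictionary
first = DATE.search(lines[0]).groupdict()
print(first)  # {'year': '2024', 'month': '03', 'day': '15'}
```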



Frequently asked questions

How do I write regex in Python?
Use the re module with functions like re.search(), re.match(), or re.findall().
How do I make regex case-insensitive in Python?
How do I validate an email with regex in Python?
What’s the difference between match() and search()?
Where can I test my Python regex patterns?

How do you escape special characters in regex patterns?