Master Python Regular Expressions: re.match(), re.search(), re.findall() – Practical Examples
What Is a Regular Expression in Python?
A regular expression (regex) is a compact textual description that tells Python how to locate patterns within a string. Regexes are indispensable for data extraction, validation, and transformation across codebases, logs, configuration files, and web scraping projects.
In Python, the re module ships with the interpreter and exposes a rich API for compiling patterns, searching, and replacing text. Whether you’re filtering user input, parsing logs, or scraping content from websites, a solid grasp of regex fundamentals saves time and reduces bugs.
Core Regex Concepts
- Literal characters – match the exact text you write.
- Metacharacters – special symbols that describe classes, ranges, or quantifiers (e.g., \d, \w, +, ?, *).
- Anchors – ^ and $ pin the pattern to the start or end of a string.
- Quantifiers – specify how many times a token may repeat.
- Groups & alternation – capture sub‑matches and allow multiple options.
- Flags – tweak pattern behavior (case‑insensitivity, multiline, etc.).
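The concepts listed above can be seen together in a small sketch. The sample line and variable names below are invented for illustration and do not come from a real log.

```python
import re

# Hypothetical sample line, invented for illustration.
line = "Error 404: page not found; error 500: server fault"

# Literal text "error ", a group with alternation (404|500), and the
# IGNORECASE flag so that "Error" and "error" both match.
codes = re.findall(r"error (404|500)", line, re.IGNORECASE)
print(codes)  # ['404', '500']

# Quantifier {3} with \b word boundaries: exactly three digits as a whole token.
three_digit = re.findall(r"\b\d{3}\b", line)
print(three_digit)  # ['404', '500']

# The ^ anchor pins the match to the start of the string.
starts_with_error = bool(re.search(r"^error", line, re.IGNORECASE))
print(starts_with_error)  # True
```

With a single capturing group in the pattern, re.findall() returns only the captured text, which is why the first result holds the codes rather than the full "error 404" matches.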
Common Identifier Table
| Identifier | Description |
|---|---|
| \d | Any digit (0‑9) |
| \D | Any non‑digit |
| \s | Any whitespace character (space, tab, newline, etc.) |
| \S | Any non‑whitespace character |
| \w | Any word character (letter, digit, or underscore) |
| \W | Any non‑word character (anything other than a letter, digit, or underscore) |
| . | Any character except newline |
| \b | Word boundary |
| \. | Literal period |
| {x} | Exactly x occurrences of the preceding token |
- Whitespace escapes – \n (newline), \t (tab), \r (carriage return), \f (form feed); \s matches any of them.
- Characters that need a backslash to match literally – . + * ? [ ] $ ^ ( ) { } | \
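A short sketch of several identifiers from the table in action. The sample text is hypothetical. re.escape() is a standard re function that adds the backslashes needed to match a string literally.

```python
import re

# Hypothetical sample text, invented for illustration.
text = "Order A_17 costs $3.50 (approx.)"

digits = re.findall(r"\d+", text)          # runs of digits
words = re.findall(r"\w+", text)           # runs of word characters
price = re.findall(r"\d\.\d{2}", text)     # digit, literal period, two digits
print(digits)  # ['17', '3', '50']
print(words)   # ['Order', 'A_17', 'costs', '3', '50', 'approx']
print(price)   # ['3.50']

# "$" and "." are metacharacters, so escape them before searching literally.
literal = re.escape("$3.50")
match = re.search(literal, text)
print(match.group())  # $3.50
```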
Getting Started with re
import re
The re module is part of Python’s standard library and requires no external dependencies. Common use‑cases include:
- String validation (e.g., email addresses, phone numbers)
- Data extraction from unstructured text
- Automated web scraping
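As a minimal sketch of the validation use‑case, the snippet below checks a string against an assumed phone format (three digits, three digits, four digits, separated by hyphens). The format and the helper name is_valid_phone are illustrative assumptions; real phone numbers vary widely.

```python
import re

# Assumed format for illustration only, e.g. 555-123-4567.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

def is_valid_phone(value):
    """Return True if the whole string matches the assumed phone format."""
    # fullmatch() succeeds only when the pattern covers the entire string.
    return PHONE_PATTERN.fullmatch(value) is not None

print(is_valid_phone("555-123-4567"))      # True
print(is_valid_phone("555-123-4567 ext"))  # False
print(is_valid_phone("5551234567"))        # False
```

Using fullmatch() avoids having to add ^ and $ by hand, and compiling the pattern once is convenient when the same check runs many times.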
Example 1: \w+ and ^
- ^ – anchor at the beginning of a string.
- \w+ – one or more word characters (letters, digits, underscore).
Using these, we can capture the first word of a string:
import re
sample = "guru99, education is fun"
result = re.findall(r"^\w+", sample)
print(result)  # Output: ['guru99']
Removing the + would return only the first character: ['g'].
Example 2: Splitting on Whitespace with \s
The re.split() function can divide a string wherever a pattern matches. Using \s splits on any whitespace character.
import re
text = "we are splitting the words"
print(re.split(r"\s", text))  # ['we', 'are', 'splitting', 'the', 'words']
If the backslash is omitted, s is treated as a literal, resulting in splits at every s character.
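One detail worth seeing in practice: \s matches a single whitespace character, so consecutive spaces produce empty strings in the result. A sketch with a hypothetical input containing a double space, a tab, and a triple space:

```python
import re

# Hypothetical input with irregular whitespace.
text = "we  are\tsplitting   the words"

single = re.split(r"\s", text)
print(single)  # ['we', '', 'are', 'splitting', '', '', 'the', 'words']

# \s+ treats each run of whitespace as one separator.
runs = re.split(r"\s+", text)
print(runs)    # ['we', 'are', 'splitting', 'the', 'words']
```

For plain whitespace splitting, the built‑in text.split() with no arguments gives the same result as the \s+ version here.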
Regex Methods Overview
- re.match() – checks for a match only at the beginning of the string.
- re.search() – scans the entire string and returns the first match.
- re.findall() – returns a list of all non‑overlapping matches.
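The difference between the three functions is easiest to see when they run against the same input. The sample string and pattern below are chosen for illustration.

```python
import re

text = "learn regex at guru99 and careerguru99"
pattern = r"\w*guru99"

# match(): the string does not start with the pattern, so the result is None.
m = re.match(pattern, text)
print(m)  # None

# search(): returns a match object for the first occurrence anywhere.
s = re.search(pattern, text)
print(s.group())  # guru99

# findall(): returns every non-overlapping occurrence as a list of strings.
all_hits = re.findall(pattern, text)
print(all_hits)  # ['guru99', 'careerguru99']
```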
Using re.match()
import re
text = "guru99 is a great platform"
match = re.match(r"^\w+", text)
print(match.group() if match else "No match")  # Output: guru99
Using re.search()
import re
text = "Software Testing is fun"
for pattern in ["Software testing", "guru99"]:
    found = re.search(pattern, text, re.IGNORECASE)
    print(f"Looking for '{pattern}' in '{text}' - {'found a match!' if found else 'no match'}")
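Beyond a yes/no answer, re.search() returns a match object that records what matched and where. A brief sketch reusing the same sample sentence:

```python
import re

text = "Software Testing is fun"
found = re.search(r"testing", text, re.IGNORECASE)

# Always check for None before using the match object.
if found:
    print(found.group())               # Testing
    print(found.start(), found.end())  # 9 16
    print(found.span())                # (9, 16)
```

The positions are zero‑based, and the end index is exclusive, so text[9:16] gives back the matched text.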
Using re.findall()
import re
emails = "guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com"
found = re.findall(r"[\w\.-]+@[\w\.-]+", emails)
for e in found:
    print(e)
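When the pattern contains capturing groups, re.findall() returns tuples of the groups instead of whole matches. The sketch below splits each address into user and domain; the group names user and domain are chosen for illustration, and the pattern is deliberately loose rather than a full email validator.

```python
import re

emails = "guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com"

# Two capturing groups: findall() yields one (user, domain) tuple per match.
pairs = re.findall(r"([\w.-]+)@([\w.-]+)", emails)
print(pairs)
# [('guru99', 'google.com'), ('careerguru99', 'hotmail.com'), ('users', 'yahoomail.com')]

# finditer() yields match objects, which allows access to named groups.
domains = []
for m in re.finditer(r"(?P<user>[\w.-]+)@(?P<domain>[\w.-]+)", emails):
    domains.append(m.group("domain"))
print(domains)  # ['google.com', 'hotmail.com', 'yahoomail.com']
```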
Regex Flags
Flags modify pattern behavior. Common flags include:
| Flag | Description |
|---|---|
| re.M (re.MULTILINE) | Make ^ and $ match the start and end of each line. |
| re.I (re.IGNORECASE) | Ignore case distinctions. |
| re.S (re.DOTALL) | Make dot match newlines. |
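| re.U (re.UNICODE) | Apply Unicode-aware character classes. This is already the default for str patterns in Python 3, so the flag is redundant there. |
| re.L (re.LOCALE) | Make classes depend on the current locale. In Python 3 it works only with bytes patterns. |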
| re.X (re.VERBOSE) | Allow whitespace and comments in the pattern. |
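A sketch of two flags from the table that the examples do not otherwise cover, plus combining flags with the | operator. The date string and variable names are invented for illustration.

```python
import re

# VERBOSE: whitespace and # comments inside the pattern are ignored,
# which makes longer patterns easier to read.
date_pattern = re.compile(
    r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
    """,
    re.VERBOSE,
)
parts = date_pattern.search("released on 2024-01-15").groups()
print(parts)  # ('2024', '01', '15')

# DOTALL: by default "." does not match a newline.
text = "start\nEND"
without_dotall = re.search(r"start.end", text, re.IGNORECASE)
with_dotall = re.search(r"start.end", text, re.IGNORECASE | re.DOTALL)
print(without_dotall)             # None
print(repr(with_dotall.group()))  # 'start\nEND'
```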
Multiline Example
import re
text = """guru99
careerguru99
selenium"""
print(re.findall(r"^\w", text))  # ['g']
print(re.findall(r"^\w", text, re.MULTILINE))  # ['g', 'c', 's']
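The same flag affects the $ anchor. As a small companion sketch, the snippet below captures the last character of each line, using the same three words joined with explicit newlines:

```python
import re

text = "guru99\ncareerguru99\nselenium"

# Without MULTILINE, $ matches only at the end of the whole string.
last_overall = re.findall(r"\w$", text)
print(last_overall)   # ['m']

# With MULTILINE, $ also matches just before each newline.
last_per_line = re.findall(r"\w$", text, re.MULTILINE)
print(last_per_line)  # ['9', '9', 'm']
```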
Python 2 Compatibility
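All examples above are written for Python 3. If you must run them under Python 2, the re module is imported in the same way and the single-argument print() calls work unchanged, but the f-string in the re.search() example must be rewritten with str.format() or % formatting, because f-strings require Python 3.6 or later.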
Takeaway
Mastering Python regex gives you a powerful toolkit for text processing:
- Use re.match() for patterns anchored at the start.
- Use re.search() to find the first match anywhere in the string.
- Use re.findall() to extract all matches.
- Apply flags (e.g., re.IGNORECASE, re.MULTILINE) to fine‑tune behavior.
- Remember to test patterns with Regex101 or similar tools before deployment.