Master Python Regular Expressions: re.match(), re.search(), re.findall() – Practical Examples

What Is a Regular Expression in Python?

A regular expression (regex) is a compact textual description that tells Python how to locate patterns within a string. Regexes are indispensable for data extraction, validation, and transformation across codebases, logs, configuration files, and web scraping projects.

In Python, the re module ships with the interpreter and exposes a rich API for compiling patterns, searching, and replacing text. Whether you’re filtering user input, parsing logs, or scraping content from websites, a solid grasp of regex fundamentals saves time and reduces bugs.
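As a rough illustration of compiling, searching, and replacing, the sketch below compiles one pattern and reuses it for all three tasks. The date pattern and the log line are made-up sample data, not taken from any real system.

```python
import re

# Compile once, then reuse the pattern object. Pattern and text are illustrative.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

log_line = "2024-01-15 ERROR disk full; retry scheduled for 2024-01-16"

first_date = date_pattern.search(log_line).group()  # first match only
all_dates = date_pattern.findall(log_line)           # every match, as strings
redacted = date_pattern.sub("<DATE>", log_line)      # replace every match

print(first_date)
print(all_dates)
print(redacted)
```

Compiling is optional for one-off calls, since the module-level functions such as re.search() accept the pattern string directly.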

Core Regex Concepts

Literal characters – match the exact text you write.
Metacharacters – special symbols that describe classes, ranges, or quantifiers (e.g., \d, \w, +, ?, *).
Anchors – ^ and $ pin the pattern to the start or end of a string.
Quantifiers – specify how many times a token may repeat.
Groups & alternation – capture sub‑matches and allow multiple options.
Flags – tweak pattern behavior (case‑insensitivity, multiline, etc.).
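The concepts listed above can each be seen in a one-line call. This is a minimal sketch with invented sample strings; the variable names are only for illustration.

```python
import re

# Literal characters: match exactly what is written, wherever it occurs.
literal = re.findall(r"cat", "cat concat category")

# Quantifier: {2,3} means two or three repetitions of the preceding token.
quantified = re.findall(r"\d{2,3}", "7 42 365")

# Group and alternation: capture either "jpg" or "png" after a literal dot.
extensions = re.findall(r"\.(jpg|png)", "a.jpg b.gif c.png")

# Anchors: ^ and $ together require the pattern to cover the whole string.
anchored = bool(re.search(r"^\d+$", "12345"))
not_anchored = bool(re.search(r"^\d+$", "123 45"))

# Flag: re.IGNORECASE makes letters match regardless of case.
ignoring_case = re.findall(r"python", "Python PYTHON python", re.IGNORECASE)

print(literal, quantified, extensions, anchored, not_anchored, ignoring_case)
```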

Common Identifier Table

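Identifier	Description
\d	Any digit (0-9)
\D	Any non-digit
\s	Any whitespace character (space, tab, newline, etc.)
\S	Any non-whitespace character
\w	Any word character (letter, digit, or underscore)
\W	Any non-word character
.	Any character except newline
\b	Word boundary
\.	Literal period
{x}	Exactly x occurrences of the preceding token

Whitespace escapes: \n (newline), \t (tab), \r (carriage return), \f (form feed), \v (vertical tab). \s matches any of them. The \e escape found in some other regex flavors is not recognized by Python's re module.

Characters that must be escaped with a backslash to match literally: . + * ? [ ] $ ^ ( ) { } | \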
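To show why special characters need escaping, here is a small sketch using an invented price string. It compares an unescaped pattern with an escaped one, and uses re.escape() to escape arbitrary literal text automatically.

```python
import re

price_text = "Total: $5.00 (was $6.50)"

# Unescaped, "$" is an end-of-string anchor, so no digits can follow it
# and the pattern finds nothing.
unescaped = re.findall(r"$\d+.\d+", price_text)

# Escaping "$" and "." makes both match literally.
escaped = re.findall(r"\$\d+\.\d+", price_text)

# re.escape() escapes every special character in a piece of literal text.
literal = "(was $6.50)"
found = re.search(re.escape(literal), price_text)

print(unescaped)
print(escaped)
print(found.group())
```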

Getting Started with `re`

import re

The re module is part of Python’s standard library and requires no external dependencies. Common use‑cases include:

String validation (e.g., email addresses, phone numbers)
Data extraction from unstructured text
Automated web scraping
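For the validation use case, one possible approach is re.fullmatch(), which succeeds only when the entire string fits the pattern. The phone format below (three digits, a hyphen, four digits) is a deliberately simple, hypothetical format, and is_phone is an illustrative helper name.

```python
import re

# Hypothetical format for illustration: three digits, a hyphen, four digits.
PHONE = re.compile(r"\d{3}-\d{4}")

def is_phone(value):
    """Return True only if the entire string matches the pattern."""
    return PHONE.fullmatch(value) is not None

print(is_phone("555-1234"))        # whole string matches
print(is_phone("555-1234 ext 9"))  # trailing text is rejected
print(is_phone("call 555-1234"))   # leading text is rejected
```

Real-world formats such as email addresses are far more permissive than short patterns suggest, so a pattern like this is best treated as a first-pass filter.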

Example 1: `\w+` and `^`

^ – anchor at the beginning of a string.
\w+ – one or more word characters (letters, digits, underscore).

Using these, we can capture the first word of a string:

import re
sample = "guru99, education is fun"
result = re.findall(r"^\w+", sample)
print(result)  # Output: ['guru99']

Without the +, the pattern matches a single word character, so the result is only the first character: ['g'].
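The effect of each piece of the pattern can be checked by varying one piece at a time. This short sketch reuses the sample string from the example above.

```python
import re

sample = "guru99, education is fun"

first_word = re.findall(r"^\w+", sample)  # anchored, one or more word characters
first_char = re.findall(r"^\w", sample)   # anchored, exactly one word character
all_words = re.findall(r"\w+", sample)    # no anchor: every word in the string

print(first_word)
print(first_char)
print(all_words)
```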

Example 2: Splitting on Whitespace with `\s`

The re.split() function can divide a string wherever a pattern matches. Using \s splits on any whitespace character.

import re
text = "we are splitting the words"
print(re.split(r"\s", text))  # ['we', 'are', 'splitting', 'the', 'words']

If the backslash is omitted, s is treated as a literal character and the string is split at every "s", giving ['we are ', 'plitting the word', ''].
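The sketch below puts the two splits side by side and adds one related case: with runs of consecutive whitespace, \s yields empty strings, while \s+ treats each run as a single separator. The second sample string is invented for illustration.

```python
import re

text = "we are splitting the words"

on_whitespace = re.split(r"\s", text)  # split at each whitespace character
on_literal_s = re.split(r"s", text)    # split at each literal "s"

# Two spaces and a tab: \s splits at every character, \s+ at every run.
spaced = "we  are\tsplitting"
single = re.split(r"\s", spaced)
collapsed = re.split(r"\s+", spaced)

print(on_whitespace)
print(on_literal_s)
print(single)
print(collapsed)
```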

Regex Methods Overview

re.match() – checks for a match only at the beginning of the string.
re.search() – scans the entire string and returns the first match.
re.findall() – returns a list of all non‑overlapping matches.
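Running the three functions with the same pattern on the same string makes the difference concrete. The order text here is an invented example.

```python
import re

text = "order 66 shipped, order 99 pending"
pattern = r"\d+"

at_start = re.match(pattern, text)      # None: the string starts with "order"
first = re.search(pattern, text)        # match object for the first hit
everything = re.findall(pattern, text)  # list of all matches, as strings

print(at_start)
print(first.group(), first.span())
print(everything)
```

Note that re.match() and re.search() return a match object or None, while re.findall() always returns a list, which is empty when nothing matches.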

Using `re.match()`

import re
text = "guru99 is a great platform"
match = re.match(r"^\w+", text)  # ^ is redundant here: re.match() only matches at the start
print(match.group() if match else "No match")  # Output: guru99
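A match object carries more than the matched text. As a small sketch on the same sentence, the pattern below adds two capturing groups and reads them back by number; the second call shows that re.match() returns None when the pattern does not occur at the start.

```python
import re

text = "guru99 is a great platform"

# Two capturing groups: the first word, then everything after the whitespace.
match = re.match(r"(\w+)\s+(.*)", text)
if match:
    print(match.group(1))  # first group
    print(match.group(2))  # second group
    print(match.span(1))   # start and end index of the first group

# "great" occurs in the text, but not at the beginning.
not_at_start = re.match(r"great", text)
print(not_at_start)
```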

Using `re.search()`

import re
text = "Software Testing is fun"
for pattern in ["Software testing", "guru99"]:
    found = re.search(pattern, text, re.IGNORECASE)
    print(f"Looking for '{pattern}' in '{text}' - {'found a match!' if found else 'no match'}")
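The re.IGNORECASE flag is what lets "Software testing" match "Software Testing" in the loop above. The sketch below runs the same search with and without the flag to show the difference.

```python
import re

text = "Software Testing is fun"

case_sensitive = re.search("Software testing", text)
case_insensitive = re.search("Software testing", text, re.IGNORECASE)

print(case_sensitive)             # no match: "t" differs from "T"
print(case_insensitive.group())   # the text as it appears in the string
print(case_insensitive.span())    # where the match starts and ends
```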

Using `re.findall()`

import re
emails = "guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com"
found = re.findall(r"[\w\.-]+@[\w\.-]+", emails)
for e in found:
    print(e)
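When the pattern contains capturing groups, re.findall() returns the groups rather than the whole match. As a sketch on the same address list, wrapping each side of the @ in parentheses yields (user, domain) tuples.

```python
import re

emails = "guru99@google.com, careerguru99@hotmail.com, users@yahoomail.com"

# Two capturing groups, so each result is a (user, domain) tuple.
pairs = re.findall(r"([\w.-]+)@([\w.-]+)", emails)
domains = [domain for _user, domain in pairs]

print(pairs)
print(domains)
```

Inside a character class the dot is already literal, so [\w.-] and [\w\.-] behave the same.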

Regex Flags

Flags modify pattern behavior. Common flags include:

Flag	Description
re.M (re.MULTILINE)	Make ^ and $ match the start and end of each line.
re.I (re.IGNORECASE)	Ignore case distinctions.
re.S (re.DOTALL)	Make dot match newlines.
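re.U (re.UNICODE)	Use Unicode rules for \w, \b, \d, and \s. Already the default for str patterns in Python 3, so the flag is redundant there.
re.L (re.LOCALE)	Make \w, \b, and case-insensitive matching depend on the current locale. In Python 3 it is allowed only with bytes patterns.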
re.X (re.VERBOSE)	Allow whitespace and comments in the pattern.
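Two of the flags from the table are easier to grasp in action. The sketch below uses invented sample strings: the first part shows re.DOTALL letting the dot cross a newline, and the second shows re.VERBOSE combined with re.IGNORECASE through the | operator.

```python
import re

# re.DOTALL: "." also matches newline characters.
html = "<p>first\nsecond</p>"
without_dotall = re.search(r"<p>.*</p>", html)
with_dotall = re.search(r"<p>.*</p>", html, re.DOTALL)

# re.VERBOSE: whitespace and "#" comments inside the pattern are ignored.
version = re.compile(
    r"""
    v(\d+)      # major version
    \.(\d+)     # minor version
    """,
    re.VERBOSE | re.IGNORECASE,
)
parts = version.search("Release V3.12 is out").groups()

print(without_dotall)
print(with_dotall.group() == html)
print(parts)
```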

Multiline Example

import re
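text = """guru99
careerguru99
selenium"""
# Without the flag, ^ matches only at the very start of the string.
print(re.findall(r"^\w", text))                # ['g']
# With re.MULTILINE, ^ also matches at the start of every line.
print(re.findall(r"^\w", text, re.MULTILINE))  # ['g', 'c', 's']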
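re.MULTILINE changes $ in the same way it changes ^. A minimal sketch with an invented three-line string:

```python
import re

text = "alpha 1\nbeta 2\ngamma 3"

last_default = re.findall(r"\d$", text)                 # end of the whole string only
last_per_line = re.findall(r"\d$", text, re.MULTILINE)  # end of every line
first_words = re.findall(r"^\w+", text, re.MULTILINE)   # start of every line

print(last_default)
print(last_per_line)
print(first_words)
```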

Python 2 Compatibility

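All examples above are written for Python 3. The re module is imported the same way under Python 2, and single-argument print(...) calls run unchanged there because the parentheses are parsed as grouping. The f-string in the re.search() example requires Python 3.6 or later; under Python 2, build the message with str.format() or % formatting instead.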