Regex Cheat Sheet - Python

Regular expressions (regex) are patterns for searching, matching, validating, and replacing text. In Python they are available through the built-in re module. This cheat sheet offers a quick reference to common regex patterns and symbols.

Basic Characters

Expression	Explanation
^	Matches the start of a string (or the start of each line in MULTILINE mode).
$	Matches the end of a string (or the end of each line in MULTILINE mode).
.	Matches any character except a newline.
a	Matches the character a.
xy	Matches the string xy.
a|b	Matches expression a or b. At each position a is tried first; if it matches, b is not tried.

Python

import re

print(re.search(r"^x", "xenon"))
print(re.search(r"s$", "geeks"))

Output

<re.Match object; span=(0, 1), match='x'>
<re.Match object; span=(4, 5), match='s'>

Explanation:

^x matches x at the start of the string
s$ matches s at the end of the string
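
The remaining rows of the table (alternation, the dot, and MULTILINE anchors) can be illustrated with a minimal sketch. The sample strings and the variable names alt, dot, and ends are illustrative assumptions, not part of the original examples.

```python
import re

# Alternation: at each position "cat" is tried before "dog",
# but the earliest position in the string still wins.
alt = re.search(r"cat|dog", "hotdog and cat")

# The dot matches any single character except a newline,
# so "c\nt" is not matched.
dot = re.findall(r"c.t", "cat cut c\nt")

# With re.MULTILINE, $ also matches just before each newline.
ends = re.findall(r"\w$", "ab\ncd", flags=re.MULTILINE)

print(alt.group())  # dog
print(dot)          # ['cat', 'cut']
print(ends)         # ['b', 'd']
```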

Quantifiers

Quantifiers specify how many times the preceding expression must occur.

Expression	Explanation
+	Matches 1 or more occurrences of the preceding expression.
*	Matches 0 or more occurrences of the preceding expression.
?	Matches 0 or 1 occurrence of the preceding expression.
{p}	Matches the preceding expression exactly p times.
{p,q}	Matches the preceding expression at least p and at most q times. Do not put a space after the comma: Python treats {p, q} as literal text.
{p,}	Matches the preceding expression p or more times.
{,q}	Matches the preceding expression up to q times (same as {0,q}).

Python

import re

print(re.search(r"9+", "289908"))
print(re.search(r"\d{3}", "hello1234"))

Output

<re.Match object; span=(2, 4), match='99'>
<re.Match object; span=(5, 8), match='123'>

Explanation:

9+ matches consecutive 9s -> 99
\d{3} matches exactly three digits -> 123
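
The brace forms and the optional quantifier can be compared side by side. This is a small sketch on an assumed sample string; the variable names are illustrative. It also shows the lazy variant (a trailing ?), and that a space inside the braces turns the quantifier into literal text.

```python
import re

text = "aaaa"

exact = re.search(r"a{2}", text).group()      # exactly two
ranged = re.search(r"a{2,3}", text).group()   # greedy: takes as many as allowed
at_least = re.search(r"a{2,}", text).group()  # two or more
lazy = re.search(r"a{2,3}?", text).group()    # lazy: takes as few as allowed
optional = re.findall(r"colou?r", "color colour")

# With a space after the comma the braces are treated as literal
# characters, so nothing in "aaaa" matches.
spaced = re.search(r"a{2, 3}", text)

print(exact, ranged, at_least, lazy)  # aa aaa aaaa aa
print(optional)                       # ['color', 'colour']
print(spaced)                         # None
```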

Character Classes

Character classes match any single character from a predefined set. The anchors \A, \Z, \b, and \B are listed here as well; they match a position rather than a character.

Expression	Explanation
\w	Matches word characters: a-z, A-Z, 0-9, and underscore (_). For str patterns this also includes Unicode letters and digits unless re.ASCII is set.
\W	Matches any character that is not a word character.
\d	Matches digits 0-9 (and other Unicode decimal digits unless re.ASCII is set).
\D	Matches any non-digit character.
\s	Matches whitespace characters: space, \t, \n, \r, \f, and \v.
\S	Matches non-whitespace characters.
\A	Matches only at the absolute start of the string, even in MULTILINE mode.
\Z	Matches only at the absolute end of the string, even in MULTILINE mode.
\n	Matches a newline character.
\t	Matches a tab character.
\b	Matches the empty string at a word boundary, that is, at the start or end of a word.
\B	Matches the empty string where \b does not, that is, at a non-word boundary.

Python

import re

print(re.search(r"\s", "xenon is a gas"))
print(re.search(r"\D+\d*", "123geeks123"))

Output

<re.Match object; span=(5, 6), match=' '>
<re.Match object; span=(3, 11), match='geeks123'>

Explanation:

\s matches the first space
\D+\d* matches one or more non-digits followed by zero or more digits -> geeks123
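
The position-based escapes (\b, \B, \A, \Z) are easier to see in action. The following is a minimal sketch with assumed sample strings and illustrative variable names; it contrasts \A and \Z with ^ in MULTILINE mode.

```python
import re

# \b requires a word boundary on both sides, so "concat" and "cats" are skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

# \B requires that there is no boundary, so only the "cat" inside a word matches.
inside = re.findall(r"\Bcat\B", "cat concatenate")

lines = "one\ntwo"
# ^ matches at the start of every line in MULTILINE mode...
line_starts = re.findall(r"^\w+", lines, flags=re.MULTILINE)
# ...while \A and \Z stay tied to the start and end of the whole string.
string_start = re.findall(r"\A\w+", lines, flags=re.MULTILINE)
string_end = re.findall(r"\w+\Z", lines, flags=re.MULTILINE)

print(whole_words)   # ['cat', 'cat']
print(inside)        # ['cat']
print(line_starts)   # ['one', 'two']
print(string_start)  # ['one']
print(string_end)    # ['two']
```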

Sets

Sets match one character from a group.

Expression	Explanation
[abc]	Matches a single a, b, or c. It does not match the string abc as a whole.
[a-z]	Matches any lowercase letter from a to z.
[A-Z]	Matches any uppercase letter from A to Z.
[a\-p]	Matches a, -, or p. The hyphen is literal because \ escapes it.
[-z]	Matches - or z. A hyphen at the start of a set is literal.
[a-z0-9]	Matches any character from a to z or from 0 to 9.
[(+*)]	Special characters become literal inside a set, so this matches (, +, *, or ).
[^ab5]	A ^ at the start negates the set. Here, it matches any character that is not a, b, or 5.
\[a\]	Matches the literal text [a] because both square brackets are escaped.

Python

import re

print(re.search(r"[^abc]", "abcde"))
print(re.search(r"[a-p]", "xenon"))

Output

<re.Match object; span=(3, 4), match='d'>
<re.Match object; span=(1, 2), match='e'>

Explanation:

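[^abc] matches d, the first character that is not a, b, or c
[a-p] matches e, the first character in the range a to p (x is outside the range)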
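
The other set forms from the table behave as follows. This is a short sketch on assumed sample strings; the variable names are illustrative.

```python
import re

# An escaped hyphen is a literal character, not a range operator.
escaped_hyphen = re.findall(r"[a\-p]", "a-p b")

# Metacharacters such as ( + * ) lose their special meaning inside a set.
literals = re.findall(r"[(+*)]", "(1+2)*3")

# A leading ^ negates the set.
negated = re.findall(r"[^ab5]", "ab5c6")

# Ranges can be combined; the uppercase R is not in either range.
combined = re.findall(r"[a-z0-9]+", "Room 42b")

# Escaped brackets match literal brackets instead of opening a set.
brackets = re.search(r"\[a\]", "x[a]y").group()

print(escaped_hyphen)  # ['a', '-', 'p']
print(literals)        # ['(', '+', ')', '*']
print(negated)         # ['c', '6']
print(combined)        # ['oom', '42b']
print(brackets)        # [a]
```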

Groups

Groups allow you to capture parts of a match.

Expression	Explanation
( )	Groups the enclosed expression and captures the text it matches for later retrieval.
(?#...)	A comment; its contents are ignored.
(?P<name>pattern)	Matches pattern and captures it in a group that can be retrieved by name.
(?:A)	Matches the expression A as a group, but the matched text cannot be retrieved afterwards.
(?P=group)	Matches the same text that was matched by an earlier group named “group”.

Python

import re

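# Non-capturing group: AB is matched but not stored
example = re.search(r"(?:AB)", "ACABC")
print(example)
print(example.groups())

# Capturing groups: one for each word around the comma
result = re.search(r"(\w*), (\w*)", "geeks, best")
print(result.groups())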

Output

<re.Match object; span=(2, 4), match='AB'>
()
('geeks', 'best')

Explanation:

re.search(r"(?:AB)", "ACABC"): Finds AB using a non-capturing group, so nothing is stored.
example.groups(): Returns () because non-capturing groups don’t save matches.
re.search(r"(\w*), (\w*)", "geeks, best"): Uses capturing groups to extract words before and after the comma.
result.groups(): Returns ('geeks', 'best').
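
Named groups and named backreferences from the table are not covered by the example above. A minimal sketch follows; the group names year, month, and word and the sample strings are assumptions chosen for illustration.

```python
import re

# Named groups can be read by name or collected with groupdict().
date = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2024-05")
year = date.group("year")
parts = date.groupdict()

# (?P=word) matches the same text that the group named "word" matched,
# which makes it useful for spotting a repeated word.
repeat = re.search(r"(?P<word>\w+) (?P=word)", "it is is fine")

# (?#...) is ignored by the engine and only documents the pattern.
commented = re.search(r"\d+(?#one or more digits)", "a12b").group()

print(year)            # 2024
print(parts)           # {'year': '2024', 'month': '05'}
print(repeat.group())  # is is
print(commented)       # 12
```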

Assertions

Assertions are regex patterns that match a position in a string without consuming any characters.

Expression	Explanation
A(?=B)	Matches the expression A only if it is followed by B (positive lookahead assertion).
A(?!B)	Matches the expression A only if it is not followed by B (negative lookahead assertion).
(?<=B)A	Matches the expression A only if B is immediately to its left (positive lookbehind assertion). B must match a fixed-length string.
(?<!B)A	Matches the expression A only if B is not immediately to its left (negative lookbehind assertion). B must match a fixed-length string.
(?(id/name)yes|no)	Conditional: matches yes if the group with the given id or name matched, otherwise matches no. The no branch is optional.

Python

import re

print(re.search(r"z(?=a)", "pizza"))
print(re.search(r"z(?!a)", "pizza"))

Output

<re.Match object; span=(3, 4), match='z'>
<re.Match object; span=(2, 3), match='z'>

Explanation:

re.search(r"z(?=a)", "pizza"): Positive lookahead; matches z only if followed by a.
re.search(r"z(?!a)", "pizza"): Negative lookahead; matches z only if not followed by a.
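
Lookbehind assertions and the conditional form work along the same lines. The sketch below uses assumed sample strings and illustrative variable names; the conditional pattern (<)?\w+(?(1)>) is one possible way to require a closing > only when an opening < was matched.

```python
import re

# Positive lookbehind: digits only if a dollar sign is immediately to the left.
price = re.search(r"(?<=\$)\d+", "cost: $45").group()

# Negative lookbehind: whole numbers that are not preceded by a dollar sign.
plain = re.findall(r"(?<!\$)\b\d+", "$45 and 30")

# Conditional: (?(1)>) requires ">" only if group 1 (the "<") took part in the match.
cond = r"(<)?\w+(?(1)>)"
with_brackets = re.fullmatch(cond, "<tag>")
without_brackets = re.fullmatch(cond, "tag")
unbalanced = re.fullmatch(cond, "<tag")

print(price)                     # 45
print(plain)                     # ['30']
print(with_brackets.group())     # <tag>
print(without_brackets.group())  # tag
print(unbalanced)                # None
```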

Flags

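Flags modify regex behavior, such as ignoring case or allowing multiline matching. Each letter below can be used inline, for example (?i), or passed through the flags argument using the constant shown.

Flag	Explanation
a	Makes \w, \W, \b, \B, \d, \D, \s, and \S match ASCII only (re.A, re.ASCII).
i	Ignores case (re.I, re.IGNORECASE).
L	Makes character classes and case-insensitive matching depend on the current locale; valid only with bytes patterns (re.L, re.LOCALE).
m	Makes ^ and $ match at the start and end of each line (re.M, re.MULTILINE).
s	Makes . match any character, including a newline (re.S, re.DOTALL).
u	Uses Unicode character classes; this is the default for str patterns in Python 3 (re.U, re.UNICODE).
x	Allows whitespace and comments inside the pattern (re.X, re.VERBOSE).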

Python

import re

exp = """hello there
I am from
Geeks for Geeks"""

print(re.search(r"and", "Sun And Moon", flags=re.IGNORECASE))
print(re.findall(r"^\w", exp, flags=re.MULTILINE))

Output

<re.Match object; span=(4, 7), match='And'>
['h', 'I', 'G']

Explanation:

re.search(r"and", "Sun And Moon", flags=re.IGNORECASE): IGNORECASE matches "and" ignoring case.
re.findall(r"^\w", exp, flags=re.MULTILINE): MULTILINE matches start of each line; returns ['h', 'I', 'G'].
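
The flags not shown above (DOTALL, VERBOSE, ASCII, and the inline form) can be sketched as follows. The phone-like pattern, the sample strings, and the variable names are assumptions made for illustration only.

```python
import re

# Without DOTALL the dot stops at a newline; with it, the dot matches "\n" too.
no_dotall = re.search(r"a.b", "a\nb")
dotall = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# Inline form: (?i) at the start of the pattern is equivalent to re.IGNORECASE.
inline = re.search(r"(?i)and", "Sun And Moon").group()

# VERBOSE lets the pattern be split over lines and annotated with comments.
phone = re.compile(r"""
    (\d{3})   # first three digits
    -         # literal separator
    (\d{4})   # last four digits
""", re.VERBOSE)
pieces = phone.search("call 555-1234").groups()

# ASCII restricts \d to 0-9; by default it also matches other Unicode digits
# such as the Arabic-Indic digits written here as escapes.
sample = "42 \u0664\u0662"
unicode_digits = re.findall(r"\d+", sample)
ascii_digits = re.findall(r"\d+", sample, flags=re.ASCII)

print(no_dotall)            # None
print(dotall is not None)   # True
print(inline)               # And
print(pieces)               # ('555', '1234')
print(len(unicode_digits))  # 2
print(ascii_digits)         # ['42']
```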
