Regular expressions, often abbreviated as regex, are powerful tools for pattern matching and text manipulation in Python. They provide a concise and flexible means to search, match, and replace strings based on specific patterns.
Python's `re` module offers comprehensive support for regular expressions. To use regex in Python, first import the module:
import re
The most common regex functions in Python include:
- `re.search()`: Searches for a pattern within a string
- `re.match()`: Checks if a pattern matches at the beginning of a string
- `re.findall()`: Returns all non-overlapping matches of a pattern in a string
- `re.sub()`: Replaces occurrences of a pattern with a specified string

Regular expressions use special characters to define patterns. Here are some frequently used patterns:
| Pattern | Description |
|---|---|
| `.` | Matches any character except a newline |
| `^` | Matches the start of the string |
| `$` | Matches the end of the string |
| `*` | Matches 0 or more repetitions of the preceding element |
| `+` | Matches 1 or more repetitions of the preceding element |
| `?` | Matches 0 or 1 repetition of the preceding element |
| `\d` | Matches any digit (0-9) |
| `\w` | Matches any word character (letter, digit, or underscore) |
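To see how the four functions differ, and how a few of the patterns from the table behave, here is a minimal sketch. The sample sentence and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped; order 7 is pending."

# re.search() scans the whole string and returns the first match (or None).
first_number = re.search(r"\d+", text).group()

# re.match() only succeeds when the pattern matches at the very start.
number_at_start = re.match(r"\d+", text)        # None: the text starts with "Order"
word_at_start = re.match(r"\w+", text).group()

# re.findall() returns every non-overlapping match as a list of strings.
all_numbers = re.findall(r"\d+", text)

# re.sub() replaces every match with the given string.
hidden = re.sub(r"\d+", "#", text)

# ^ and $ anchor a pattern; ? makes the preceding character optional.
starts_with_order = bool(re.search(r"^Order", text))
ends_with_pending = bool(re.search(r"pending.$", text))  # "." matches the final period
order_words = re.findall(r"orders?", text)

print(first_number, number_at_start, word_at_start)
print(all_numbers)
print(hidden)
print(starts_with_order, ends_with_pending, order_words)
```

Note that `re.search()` and `re.match()` return a match object rather than a string, so `.group()` is needed to get the matched text, and both return `None` when nothing matches.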
Let's explore some practical examples of using regular expressions in Python:
import re
# Local part, "@", domain, then a top-level domain of at least two letters
email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
text = "Contact us at info@example.com or support@company.co.uk"
matches = re.findall(email_pattern, text)
print(matches)
# Output: ['info@example.com', 'support@company.co.uk']
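This example demonstrates how to use regex to find email addresses within a string. The pattern matches the typical structure of an email address: a local part, an @ sign, a domain, and a top-level domain of at least two letters. It is a practical approximation, not a complete validator for every legal address.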
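The parts of the email pattern can also be spelled out one by one. The sketch below assumes the `re.VERBOSE` flag, which this article does not otherwise use, and writes the final character class as `[A-Za-z]`. In verbose mode, whitespace and `#` comments inside the pattern are ignored.

```python
import re

# The email pattern split across lines so that each part can carry a comment.
email_pattern = re.compile(
    r"""
    \b                  # word boundary before the address
    [A-Za-z0-9._%+-]+   # local part: letters, digits, and . _ % + -
    @                   # the separator between local part and domain
    [A-Za-z0-9.-]+      # domain labels: letters, digits, dots, hyphens
    \.                  # the dot before the top-level domain
    [A-Za-z]{2,}        # top-level domain: at least two letters
    \b                  # word boundary after the address
    """,
    re.VERBOSE,
)

text = "Contact us at info@example.com or support@company.co.uk"
matches = email_pattern.findall(text)
print(matches)
# Output: ['info@example.com', 'support@company.co.uk']
```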
import re
text = "Call me at 123-456-7890 or (987) 654-3210"
pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'
masked_text = re.sub(pattern, 'XXX-XXX-XXXX', text)
print(masked_text)
# Output: Call me at XXX-XXX-XXXX or XXX-XXX-XXXX
This example shows how to use `re.sub()` to replace phone numbers with a masked version, maintaining privacy in text data.
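A common variation is to keep the last four digits visible. `re.sub()` also accepts a function as the replacement, so one possible sketch is to capture the final digits in a group and rebuild the masked number from it. The helper name `mask_keep_last_four` is hypothetical.

```python
import re

text = "Call me at 123-456-7890 or (987) 654-3210"
# Same phone pattern, with a capturing group around the last four digits.
pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?(\d{4})'


def mask_keep_last_four(match):
    # match.group(1) holds the text captured by the parentheses.
    return 'XXX-XXX-' + match.group(1)


masked_text = re.sub(pattern, mask_keep_last_four, text)
print(masked_text)
# Output: Call me at XXX-XXX-7890 or XXX-XXX-3210
```

The function is called once per match, which makes it a good fit when the replacement depends on what was matched.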
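Keep a few practices in mind when writing patterns:

- Use raw strings (the `r` prefix) for regex patterns to avoid escaping backslashes
- Compile frequently used patterns with `re.compile()` for better performance
- Be careful with greedy quantifiers (`*`, `+`) and use non-greedy versions (`*?`, `+?`) when appropriate

As you become more comfortable with basic regex, explore the more advanced features of the `re` module.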
These advanced features can help you create more sophisticated and efficient pattern matching solutions.
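Before moving on to those, the earlier tips on raw strings, `re.compile()`, and greedy versus non-greedy quantifiers can be checked with a short sketch. The sample strings and variable names are invented for illustration.

```python
import re

# Raw strings: the r prefix passes backslashes to the regex engine unchanged.
raw = r"\d+\.\d+"
escaped = "\\d+\\.\\d+"   # the same pattern without the r prefix
same_pattern = raw == escaped

# Compile once, then reuse the compiled pattern across many strings.
price = re.compile(raw)
lines = ["apples 1.50", "pears 2.25", "no price here"]
found = []
for line in lines:
    m = price.search(line)
    if m:
        found.append(m.group())

# Greedy quantifiers take as much as possible; non-greedy ones take as little.
html = "<b>bold</b> and <i>italic</i>"
greedy = re.findall(r"<.+>", html)
lazy = re.findall(r"<.+?>", html)

print(same_pattern)  # True
print(found)         # ['1.50', '2.25']
print(greedy)        # ['<b>bold</b> and <i>italic</i>']
print(lazy)          # ['<b>', '</b>', '<i>', '</i>']
```

The greedy pattern swallows everything between the first `<` and the last `>`, which is rarely what is intended when extracting individual tags.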
Regular expressions are invaluable tools for text processing in Python. They offer a powerful way to search, validate, and manipulate strings based on complex patterns. While the syntax may seem daunting at first, practice and experimentation will help you master this essential skill.
For more advanced string manipulation techniques, consider exploring Python String Manipulation. If you're working with large datasets, you might also find Python List Operations helpful in conjunction with regex.