Python Regular Expressions
Regular expressions, often abbreviated as regex, are powerful tools for pattern matching and text manipulation in Python. They provide a concise and flexible means to search, match, and replace strings based on specific patterns.
Basic Syntax and Usage
Python's re module offers comprehensive support for regular expressions. To use regex in Python, first import the module:
import re
The most common regex functions in Python include:
- re.search(): Searches for a pattern anywhere within a string
- re.match(): Checks if a pattern matches at the beginning of a string
- re.findall(): Returns all non-overlapping matches of a pattern in a string
- re.sub(): Replaces occurrences of a pattern with a specified string
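As a quick illustration of how these four functions differ, here is a minimal sketch. The sample sentence and variable names are made up for this example; only the re functions themselves come from the standard library.

```python
import re

# Hypothetical sample text, chosen only for illustration
text = "Order 66 shipped on 2024-05-17; order 67 ships later."

# re.search(): first match anywhere in the string (or None if nothing matches)
first_number = re.search(r"\d+", text).group()
print(first_number)  # 66

# re.match(): only succeeds if the pattern matches at the very start
match_digits = re.match(r"\d+", text)   # None, because the text starts with "Order"
match_word = re.match(r"Order", text)   # a match object
print(match_digits, match_word.group())  # None Order

# re.findall(): list of all non-overlapping matches
all_numbers = re.findall(r"\d+", text)
print(all_numbers)  # ['66', '2024', '05', '17', '67']

# re.sub(): replace every match with a replacement string
masked = re.sub(r"\d+", "#", text)
print(masked)  # Order # shipped on #-#-#; order # ships later.
```

Note that re.search() and re.match() return a match object (or None), so it is usually worth checking the result before calling .group() on it.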
Common Regex Patterns
Regular expressions use special characters to define patterns. Here are some frequently used patterns:
| Pattern | Description |
|---|---|
| . | Matches any character except newline |
| ^ | Matches the start of the string |
| $ | Matches the end of the string |
| * | Matches 0 or more repetitions of the preceding element |
| + | Matches 1 or more repetitions of the preceding element |
| ? | Matches 0 or 1 repetition of the preceding element |
| \d | Matches any digit (0-9) |
| \w | Matches any word character (letters, digits, and underscore) |
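To see each of these patterns in action, the following sketch runs them against one short string. The sample string and variable names are assumptions picked for this illustration.

```python
import re

# Hypothetical sample string for illustration
sample = "cat_9 cot ct"

any_char = re.findall(r"c.t", sample)       # . needs exactly one character between c and t
zero_or_more = re.findall(r"co*t", sample)  # * allows zero or more "o"
one_or_more = re.findall(r"co+t", sample)   # + requires at least one "o"
optional = re.findall(r"co?t", sample)      # ? allows zero or one "o"
digits = re.findall(r"\d", sample)          # \d matches a single digit
words = re.findall(r"\w+", sample)          # \w includes letters, digits, and underscore
starts_with_cat = bool(re.search(r"^cat", sample))  # ^ anchors to the start
ends_with_ct = bool(re.search(r"ct$", sample))      # $ anchors to the end

print(any_char)      # ['cat', 'cot']
print(zero_or_more)  # ['cot', 'ct']
print(one_or_more)   # ['cot']
print(optional)      # ['cot', 'ct']
print(digits)        # ['9']
print(words)         # ['cat_9', 'cot', 'ct']
print(starts_with_cat, ends_with_ct)  # True True
```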
Practical Examples
Let's explore some practical examples of using regular expressions in Python:
1. Matching Email Addresses
import re
# Inside a character class, | is a literal pipe, so the TLD class is [A-Za-z]
email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
text = "Contact us at info@example.com or support@company.co.uk"
matches = re.findall(email_pattern, text)
print(matches)
# Output: ['info@example.com', 'support@company.co.uk']
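This example uses re.findall() to extract email addresses from a string. The pattern covers the typical local-part@domain.tld structure; it is a practical approximation rather than a complete validator for every address the email standards allow.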
2. Replacing Phone Numbers
import re
text = "Call me at 123-456-7890 or (987) 654-3210"
pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'
masked_text = re.sub(pattern, 'XXX-XXX-XXXX', text)
print(masked_text)
# Output: Call me at XXX-XXX-XXXX or XXX-XXX-XXXX
This example shows how to use re.sub() to replace phone numbers with a masked version, maintaining privacy in text data.
Best Practices
- Use raw strings (prefixed with r) for regex patterns to avoid escaping backslashes
- Compile frequently used patterns with re.compile() for better performance
- Be cautious with greedy quantifiers (*, +) and use non-greedy versions (*?, +?) when appropriate
- Test your regex patterns thoroughly with various input strings
- Consider using try...except blocks to catch re.error, which is raised when a pattern is invalid and fails to compile
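Several of these practices can be sketched in a few lines. The sample strings and the safe_compile helper below are hypothetical names introduced for this example, not part of the re module.

```python
import re

# Compile once and reuse the pattern object across many inputs
word_re = re.compile(r"\w+")
word_counts = [len(word_re.findall(line)) for line in ["one two", "three four five"]]
print(word_counts)  # [2, 3]

# Greedy vs non-greedy: .+ grabs as much as possible, .+? as little as possible
html = "<b>bold</b> and <i>italic</i>"
greedy = re.findall(r"<.+>", html)
lazy = re.findall(r"<.+?>", html)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

def safe_compile(pattern):
    """Hypothetical helper: return a compiled pattern, or None if it is invalid."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        print(f"Invalid pattern {pattern!r}: {exc}")
        return None

broken = safe_compile(r"(unclosed")  # unbalanced parenthesis raises re.error
valid = safe_compile(r"\d{3}")
```

Catching re.error matters most when patterns come from user input or configuration, since a hard-coded pattern that fails to compile is usually better fixed than handled at runtime.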
Advanced Concepts
As you become more comfortable with basic regex, explore advanced concepts such as:
- Lookahead and lookbehind assertions
- Named capture groups
- Conditional patterns
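- Unicode character properties (the \p{...} syntax is not supported by the built-in re module; the third-party regex package provides it)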
These advanced features can help you create more sophisticated and efficient pattern matching solutions.
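As a rough taste of the first three concepts, here is a small sketch using only the built-in re module. The log line, price string, and variable names are invented for illustration.

```python
import re

# Hypothetical log text for illustration
log = "2024-05-17 ERROR disk full; 2024-05-18 INFO ok"

# Named capture groups: refer to parts of the match by name
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
date_parts = date_re.search(log).groupdict()
print(date_parts)  # {'year': '2024', 'month': '05', 'day': '17'}

# Lookahead: match a date only if " ERROR" follows, without consuming it
error_dates = re.findall(r"\d{4}-\d{2}-\d{2}(?= ERROR)", log)
print(error_dates)  # ['2024-05-17']

# Lookbehind: match digits only if a dollar sign precedes them
prices = re.findall(r"(?<=\$)\d+", "cost $40, tax 5, total $45")
print(prices)  # ['40', '45']

# Conditional pattern: require ")" only if the optional "(" in group 1 matched
area_code = re.compile(r"(\()?\d{3}(?(1)\))")
with_parens = area_code.fullmatch("(123)") is not None
without_parens = area_code.fullmatch("123") is not None
unbalanced = area_code.fullmatch("(123") is not None
print(with_parens, without_parens, unbalanced)  # True True False
```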
Conclusion
Regular expressions are invaluable tools for text processing in Python. They offer a powerful way to search, validate, and manipulate strings based on complex patterns. While the syntax may seem daunting at first, practice and experimentation will help you master this essential skill.
For more advanced string manipulation techniques, consider exploring Python String Manipulation. If you're working with large datasets, you might also find Python List Operations helpful in conjunction with regex.