Regular Expressions


Regular expressions (sometimes shortened to regexp, regex, or re) are a tool for matching patterns in text. In Python, they are provided by the re module. Regular expressions have widespread applications, but they are fairly complex, so before using a regex for a task, consider the alternatives and treat regexes as a last resort.
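As a rough illustration of the "consider the alternatives first" advice, the sketch below performs the same check twice: once with a plain string method and once with a regex. The sample line and both helper names are hypothetical, chosen only for this illustration.

```python
import re

line = "From: someone@example.com"  # hypothetical sample line


def starts_with_header_plain(text):
    # str.startswith accepts a tuple of prefixes, so no regex is needed here.
    return text.startswith(("From", "To", "Cc"))


def starts_with_header_regex(text):
    # The same check with a regular expression; re.match tries at the start only.
    return re.match(r"(From|To|Cc)", text) is not None


print(starts_with_header_plain(line))
print(starts_with_header_regex(line))
```

For a check this simple, the string method is arguably easier to read; a regex starts to pay off once the pattern has more structure than a fixed prefix.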

An example regex is r"^(From|To|Cc).*?[email protected]". Now for an explanation: the caret ^ matches at the beginning of a line. The group that follows, (From|To|Cc), means that the line has to start with one of the words separated by the pipe |. The pipe is the OR operator, so the regex matches if the line starts with any of the words in the group. The .*? means to un-greedily match any number of characters other than the newline character \n. Taken piece by piece, the . character means any non-newline character, the * means to repeat 0 or more times, and the ? makes the repetition un-greedy, that is, it matches as few repetitions as possible.

So, the following lines would be matched by that regex:

From: [email protected]
To: !asp]<,. [email protected]
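The pieces described above can be tried out in a short sketch. The address list@example.org is a stand-in invented for this illustration, not the address from the original pattern, and the sample lines are likewise hypothetical.

```python
import re

# Hypothetical address standing in for the one in the pattern above.
pattern = re.compile(r"^(From|To|Cc).*?list@example\.org")

lines = [
    "From: list@example.org",
    "To: !asp]<,.    list@example.org",
    "Subject: list@example.org",  # does not start with From, To, or Cc
]

for line in lines:
    # Without re.MULTILINE, ^ only matches at the very start of the string.
    print(repr(line), "->", bool(pattern.search(line)))

# Greedy versus un-greedy: .* takes as much as it can, .*? as little as it can.
text = "To: a@x.org, b@x.org"
greedy = re.match(r"^To.*x\.org", text).group()
lazy = re.match(r"^To.*?x\.org", text).group()
print(greedy)  # To: a@x.org, b@x.org
print(lazy)    # To: a@x.org
```

The first two lines match and the third does not, because only the first two begin with one of the words in the group.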

A complete reference for the re syntax is available in the Python docs.

As an example of a "proper" email-matching regex (like the one in the exercise), see this

    # Example:
    import re

    pattern = re.compile(r"\[(on|off)\]")  # Slight optimization: compile the pattern once
    print(re.search(pattern, "Mono: Playback 65 [75%] [-16.50dB] [on]"))  # Returns a Match object!
    print(re.search(pattern, "Nada...:-("))  # No match, so this prints None.
    # End Example

    # Exercise: make a regular expression that will match an email
    def test_email(your_pattern):
        pattern = re.compile(your_pattern)
        emails = ["[email protected]", "[email protected]", "wha.t.`1an?ug{}[email protected]"]
        for email in emails:
            if not re.match(pattern, email):
                print("You failed to match %s" % (email))
            elif not your_pattern:
                print("Forgot to enter a pattern!")
            else:
                print("Pass")

    pattern = r""  # Your pattern here!
    test_email(pattern)

    # One possible pattern for the exercise:
    pattern = r"\"?([-a-zA-Z0-9.`?{}][email protected]\w+\.\w+)\"?"
    test_email(pattern)
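One detail in the code above is easy to miss: the example uses re.search, which scans the whole string for a match, while test_email uses re.match, which only tries to match at the beginning of the string. A minimal sketch of the difference, reusing the pattern and sample string from the example:

```python
import re

pattern = re.compile(r"\[(on|off)\]")
text = "Mono: Playback 65 [75%] [-16.50dB] [on]"

found = pattern.search(text)    # scans the whole string for the first match
anchored = pattern.match(text)  # only tries to match at position 0

print(found.group(1))  # on
print(anchored)        # None
```

This is why re.match suits the exercise, where the address is expected at the start of each string, while re.search suits the example, where [on] appears at the end of the line.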
