SoFunction
Updated on 2025-03-02

20 Tips and Techniques for Using Regular Expressions in Python

1. Import the re module

Before you start, first make sure that the re module has been imported:

import re

2. Use the re module for matching

Here is a simple example of how to use the re module to find matches for a specific pattern in a string:

text = "The quick brown fox jumps over the lazy dog"

# Use re.findall() to find all three-character words
matches = re.findall(r'\b\w{3}\b', text)

print(matches)  # Output matching word list

In the example above, we used the regular expression \b\w{3}\b to match words of length 3. \b denotes a word boundary, and \w{3} matches exactly three word characters (letters, digits, or underscores). The re.findall() function returns all matching results as a list.

3. Use grouping

Grouping is a powerful feature in regular expressions that allows you to group the matching parts. Here is an example that demonstrates how to use groupings to extract email addresses from text:

text = "Contact us at: support@example.com, sales@example.com"

# Use a group to extract email addresses
emails = re.findall(r'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})', text)

print(emails)  # Output the extracted email address list

In the example above, ([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}) is a regular expression that matches an email address. The parentheses () capture the entire email address as a group, so re.findall() returns only the captured part of each match.
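When a pattern contains more than one group, re.findall() returns a list of tuples, one entry per group. The sketch below splits each address into a user part and a domain part; the sample text and the example.com domain are placeholders chosen for illustration.

```python
import re

# Hypothetical sample text; example.com is a reserved placeholder domain.
text = "Contact us at: support@example.com, sales@example.com"

# Group 1 captures the part before "@", group 2 captures the domain.
pairs = re.findall(r'([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})', text)

for user, domain in pairs:
    print(user, "at", domain)
```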

4. Replace strings in text

The re module also provides replacement functionality, allowing you to replace specific strings in text using regular expressions. Here is an example that demonstrates how to replace all numbers in text with "X":

text = "There are 123 apples and 456 oranges"

# Use the re.sub() function to replace the numbers in the text with "X"
new_text = re.sub(r'\d+', 'X', text)

print(new_text)  # Output the replaced text

In the example above, re.sub(r'\d+', 'X', text) uses the regular expression \d+ to match each run of one or more digits and replaces it with "X".
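Two related options are worth knowing: the count argument of re.sub() limits how many replacements are made, and re.subn() also reports how many were made. A minimal sketch, reusing the same sample sentence:

```python
import re

text = "There are 123 apples and 456 oranges"

# count=1 replaces only the first match.
first_only = re.sub(r'\d+', 'X', text, count=1)

# re.subn() returns a tuple: (new string, number of replacements).
new_text, replacements = re.subn(r'\d+', 'X', text)

print(first_only)
print(new_text, replacements)
```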

5. Use compiled regular expressions

Compiling a regular expression once and reusing the compiled object can improve efficiency when the same pattern is applied many times, for example across large amounts of text. Here is an example showing how to match using a compiled regular expression:

pattern = re.compile(r'\bpython\b', re.IGNORECASE)

text = "Python is a popular programming language"

# Use the compiled regular expression to search the text
match = pattern.search(text)

if match:
    print("Found")
else:
    print("Not found")

In the example above, the re.compile() function compiles a case-insensitive regular expression, and the search() method of the compiled pattern performs the match.
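The benefit of compiling shows up when one pattern object is reused across many strings. The following is only a sketch; the list of lines is invented for the example.

```python
import re

# Compile once, then reuse the same pattern object for every line.
pattern = re.compile(r'\bpython\b', re.IGNORECASE)

# Hypothetical input lines, invented for this example.
lines = [
    "Python is a popular programming language",
    "Ruby is popular too",
    "I write python every day",
]

matching_lines = [line for line in lines if pattern.search(line)]
print(matching_lines)
```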

By mastering the above techniques, you can use the re module in Python to process regular expressions more flexibly and efficiently. Regular expressions are a powerful skill that is very useful when dealing with text and strings.

6. Use predefined character classes

There are some predefined character classes in regular expressions that can simplify matching characters of specific types. Here are some commonly used predefined character classes and their example usage:

  • \d: Match any digit character.
  • \w: Match any letter, digit, or underscore character.
  • \s: Match any whitespace character (space, tab, line break, etc.).

text = "The code is 1234 and the password is abcd_123"

# Use the predefined character class \w to match every run of word characters
codes = re.findall(r'\b\w+\b', text)

print(codes)  # Output every word, including the code and the password
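The other two classes can be tried on the same sentence. This sketch uses \d to pull out runs of digits and \s to split on whitespace:

```python
import re

text = "The code is 1234 and the password is abcd_123"

# \d+ matches each run of digits, including the digits inside the password.
digits = re.findall(r'\d+', text)

# \s+ matches each run of whitespace, so splitting on it yields the tokens.
tokens = re.split(r'\s+', text)

print(digits)
print(tokens)
```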

7. Use quantifiers

Quantifiers are used to specify the number of matching characters or groups. Here are some commonly used quantifiers and their example usage:

  • *: Match the preceding element zero or more times.
  • +: Match the preceding element one or more times.
  • ?: Match the preceding element zero or one time.
  • {n}: Match the preceding element exactly n times.
  • {n,}: Match the preceding element at least n times.
  • {n,m}: Match the preceding element at least n times, but not more than m times.

text = "The Python programming language is widely used for data analysis"

# Use a quantifier to match words that contain at least two characters
words = re.findall(r'\b\w{2,}\b', text)

print(words)  # Output matching word list
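To see how the individual quantifiers differ, here is a short sketch on small, made-up strings:

```python
import re

# * allows zero "b" characters, + requires at least one.
zero_or_more = re.findall(r'ab*', "a ab abb")
one_or_more = re.findall(r'ab+', "a ab abb")

# ? makes the "u" optional.
candidates = ["colr", "color", "colour"]
optional = [word for word in candidates if re.fullmatch(r'colou?r', word)]

# {2,3} accepts numbers that are two or three digits long.
bounded = re.findall(r'\b\d{2,3}\b', "7 42 365 2024")

print(zero_or_more, one_or_more, optional, bounded)
```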

8. Use anchors

Anchors match positions, such as the boundaries of a string or a word, rather than actual characters. Here are some commonly used anchors and their example usage:

  • ^: Match the beginning of the string.
  • $: Match the end of the string.
  • \b: Match a word boundary.

text = "Python is a great language for both beginners and experts"

# Use the ^ anchor to match a sentence that starts with "Python"
sentence = re.findall(r'^Python.*', text)

print(sentence)  # Output matching sentences
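The $ and \b anchors can be checked in the same way. A minimal sketch; the short string "This is it" is made up for the word-boundary comparison:

```python
import re

text = "Python is a great language for both beginners and experts"

# $ anchors the match to the end of the string, ^ to the beginning.
ends_with_experts = re.search(r'experts$', text) is not None
starts_with_experts = re.search(r'^experts', text) is not None

# \b restricts the match to whole words; without it, "is" inside "This" also matches.
whole_word = re.findall(r'\bis\b', "This is it")
any_position = re.findall(r'is', "This is it")

print(ends_with_experts, starts_with_experts, whole_word, any_position)
```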

9. Greedy and non-greedy matching

In regular expressions, quantifiers are greedy by default, i.e. they match the longest possible string. Sometimes we want the shortest possible string instead, and in that case we need non-greedy matching. Adding ? after a quantifier turns it into a non-greedy quantifier.

text = "Python is a powerful programming language"

# Use a greedy match to find the content between "p" and "g"
greedy_match = re.findall(r'p.*g', text)

# Use a non-greedy match to find the content between "p" and "g"
non_greedy_match = re.findall(r'p.*?g', text)

print("Greedy match:", greedy_match)  # Output greedy matching results
print("Non-greedy match:", non_greedy_match)  # Output non-greedy matching results
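The difference is often easiest to see on delimited text. In this sketch, the markup string is a made-up example:

```python
import re

# Hypothetical markup string, invented for this example.
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the last ">" in the string.
greedy_tags = re.findall(r'<.*>', html)

# Non-greedy: .*? stops at the first ">" after each "<".
non_greedy_tags = re.findall(r'<.*?>', html)

print(greedy_tags)
print(non_greedy_tags)
```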

10. Use backreferences

Backreferences let you refer to the content captured by an earlier group within the same regular expression. This is very useful when you need to match repeated patterns.

text = "apple apple orange orange"

# Use the backreference \1 to match duplicated words
duplicates = re.findall(r'(\b\w+\b) \1', text)

print("Repeated words:", duplicates)  # Output a matching list of repeated words
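Backreferences can also be used in the replacement string of re.sub(). As a sketch, the following collapses each run of a repeated word into a single occurrence:

```python
import re

text = "apple apple orange orange"

# (\w+) captures a word; ( \1\b)+ matches one or more repetitions of that same word.
# In the replacement, \1 inserts the captured word once.
deduplicated = re.sub(r'\b(\w+)( \1\b)+', r'\1', text)

print(deduplicated)
```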

11. Multi-line matching

Sometimes we need to match across multiple lines of text, not just a single line. In that case, you can use the re.MULTILINE flag to enable multi-line mode, in which ^ and $ match at the start and end of every line.

text = """Python is a popular programming language.
It is used for web development, data analysis, and more.
Python has a simple syntax and is easy to learn."""

# Use multi-line mode to match lines that start with a capital letter
sentences = re.findall(r'^[A-Z].*$', text, re.MULTILINE)

print("Sentences that start with capital letters:", sentences)  # Output a list of matching sentences
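To see what the flag changes, this sketch runs the same pattern with and without re.MULTILINE:

```python
import re

text = """Python is a popular programming language.
It is used for web development, data analysis, and more.
Python has a simple syntax and is easy to learn."""

# Without the flag, ^ and $ only match at the start and end of the whole string,
# and . does not cross line breaks, so nothing matches.
without_flag = re.findall(r'^[A-Z].*$', text)

# With the flag, ^ and $ match at every line start and line end.
with_flag = re.findall(r'^[A-Z].*$', text, re.MULTILINE)

print(len(without_flag), len(with_flag))
```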

12. Use named groups

In complex regular expressions, named groups can be used to label the matching parts, which improves readability and maintainability.

text = "John has 5 apples, Mary has 3 oranges"

# Use named groups to extract the name and the fruit quantity
matches = re.finditer(r'(?P<name>\w+) has (?P<quantity>\d+) \w+', text)

for match in matches:
    print("Name:", match['name'], "- Quantity:", match['quantity'])
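Match objects also offer groupdict(), which returns all named groups as a dictionary. In this sketch, the third group name, fruit, is an addition made for illustration:

```python
import re

text = "John has 5 apples, Mary has 3 oranges"

# The "fruit" group is an extra named group added for this example.
pattern = re.compile(r'(?P<name>\w+) has (?P<quantity>\d+) (?P<fruit>\w+)')

# groupdict() turns each match into a {group name: matched text} dictionary.
records = [match.groupdict() for match in pattern.finditer(text)]

for record in records:
    print(record)
```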

The above are some advanced tips that can further extend your application and understanding of regular expressions. Through continuous practice and trial, you will be able to apply regular expressions more flexibly to solve various text processing problems.

13. Use lookaround assertions

Lookaround assertions let you require that certain text appears before or after the match. They do not consume any characters and only test a condition.

text = "apple banana orange grape"

# Use a lookbehind assertion to match the fruit immediately after "apple"
result = re.findall(r'(?<=apple\s)(\w+)', text)

print("Fruit after 'apple':", result)  # Output a list of matched fruits
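A lookbehind is handy when a prefix should be required but not included in the result. In this sketch, the price string is made up, and the negative form (?<!...) is shown alongside the positive one:

```python
import re

# Hypothetical input string, invented for this example.
text = "Cost: $25, tax: $3, items: 4"

# Positive lookbehind: digits that are directly preceded by "$".
prices = re.findall(r'(?<=\$)\d+', text)

# Negative lookbehind: whole numbers that are not directly preceded by "$".
counts = re.findall(r'(?<!\$)\b\d+', text)

print(prices, counts)
```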

14. Positive and negative lookahead assertions

A positive lookahead assertion (?=...) matches only where the text that follows meets the condition, and a negative lookahead assertion (?!...) matches only where the text that follows does not meet it.

text = "Python is a powerful programming language"

# Use a positive lookahead assertion to match the word that is followed by " is"
positive_result = re.findall(r'\b\w+(?= is\b)', text)

# Use a negative lookahead assertion to match whole words that are not followed by " is"
negative_result = re.findall(r'\b\w+\b(?! is\b)', text)

print("Positive lookahead assertion:", positive_result)  # Output matching word list
print("Negative lookahead assertion:", negative_result)  # Output matching word list
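Several lookaheads can be stacked at the same position to require multiple conditions at once. The rule below is a hypothetical example, not a recommended password policy:

```python
import re

# Hypothetical rule: at least 8 word characters, containing a digit and a lowercase letter.
rule = re.compile(r'(?=.*\d)(?=.*[a-z])\w{8,}')

def is_valid(candidate):
    """Return True if the whole candidate string satisfies the rule."""
    return rule.fullmatch(candidate) is not None

for candidate in ["abc12345", "abcdefgh", "12345678"]:
    print(candidate, is_valid(candidate))
```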

15. Use the re.finditer() function

The re.finditer() function is similar to re.findall(), but it returns an iterator that yields the match objects one by one.

text = "Python is a powerful programming language"

# Use the re.finditer() function to match all words
matches_iter = re.finditer(r'\b\w+\b', text)

for match in matches_iter:
    print(match.group())  # Output matching words
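Because re.finditer() yields match objects rather than plain strings, the position of each match is available as well. A minimal sketch:

```python
import re

text = "Python is a powerful programming language"

# Each match object knows its matched text and where it starts and ends.
positions = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\b\w+\b', text)]

for word, start, end in positions:
    print(f"{word!r} spans {start}..{end}")
```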

16. Use the re.split() function

In addition to matching and finding text patterns, the re module also provides the re.split() function, which splits a string according to a regular expression pattern.

text = "apple,banana,orange,grape"

# Use the re.split() function to split the string on commas
fruits = re.split(r',', text)

print("Split fruit list:", fruits)  # Output the split fruit list
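Compared with str.split(), re.split() can handle several delimiters and irregular spacing in one pattern. In this sketch, the messy input string is made up:

```python
import re

# Hypothetical input with mixed delimiters and uneven spacing.
messy = "apple, banana;orange ; grape"

# Split on a comma or semicolon, swallowing any surrounding whitespace.
fruits = re.split(r'\s*[,;]\s*', messy)

# maxsplit limits the number of splits that are performed.
head_and_rest = re.split(r',', "apple,banana,orange", maxsplit=1)

print(fruits)
print(head_and_rest)
```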

17. Use a replacement function with re.sub()

The second parameter of the re.sub() function can be a function. It is called with each match object and returns the replacement string, so the match can be processed before it is replaced.

def double(match):
    return str(int(match.group(0)) * 2)

text = "The numbers are 1, 2, 3, and 4"

# Multiply all numbers by 2 using the replacement function
new_text = re.sub(r'\d+', double, text)

print("Replaced text:", new_text)  # Output the replaced text
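The same mechanism works for any transformation that depends on the matched text. As a sketch, the hypothetical mask() helper below hides all but the last two digits of long numbers; the input string is invented:

```python
import re

def mask(match):
    """Replace all but the last two digits of the matched number with '*'."""
    digits = match.group(0)
    return '*' * (len(digits) - 2) + digits[-2:]

# Hypothetical input string, invented for this example.
text = "Card 12345678 and pin 9876"

# Only numbers with four or more digits are masked.
masked = re.sub(r'\d{4,}', mask, text)

print(masked)
```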

18. Use the re.fullmatch() function

The re.fullmatch() function, like the fullmatch() method of a compiled pattern, checks whether the entire string exactly matches the given pattern.

pattern = re.compile(r'\d{4}-\d{2}-\d{2}')

date1 = "2022-01-15"
date2 = "15-01-2022"

# Use the fullmatch() method to check the date format
match1 = pattern.fullmatch(date1)
match2 = pattern.fullmatch(date2)

if match1:
    print("The date format is correct")
else:
    print("Date format error")

if match2:
    print("The date format is correct")
else:
    print("Date format error")
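The difference from match() is that match() only requires the pattern to match at the beginning of the string, while fullmatch() requires it to cover the whole string. In this sketch, the candidate string is made up:

```python
import re

pattern = re.compile(r'\d{4}-\d{2}-\d{2}')

# Hypothetical input with trailing text after a valid-looking date.
candidate = "2022-01-15 (draft)"

prefix_match = pattern.match(candidate)    # succeeds: the string starts with a date
full_match = pattern.fullmatch(candidate)  # fails: extra text follows the date

print(prefix_match is not None, full_match is not None)
```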

19. Use the re.IGNORECASE flag for case-insensitive matching

When compiling regular expressions, you can use the re.IGNORECASE flag to perform case-insensitive matching.

pattern = re.compile(r'python', re.IGNORECASE)

text = "Python is a powerful programming language"

# Find "Python" using the case-insensitive pattern
match = pattern.search(text)

if match:
    print("Found")
else:
    print("Not found")
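Flags can be combined with the | operator, and case-insensitivity can also be switched on inside the pattern with the inline (?i) flag. A minimal sketch on a made-up three-line string:

```python
import re

# Hypothetical three-line input, invented for this example.
text = "Python\npython\nPYTHON"

# Combine flags with |: ignore case and let ^ and $ match at every line.
combined = re.findall(r'^python$', text, re.IGNORECASE | re.MULTILINE)

# The inline flag (?i) at the start of the pattern has the same effect as re.IGNORECASE.
inline = re.findall(r'(?i)python', text)

print(combined)
print(inline)
```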

20. Use the re.DEBUG flag to debug regular expressions

When compiling regular expressions, you can use the re.DEBUG flag to output debugging information about the pattern to better understand how it works.

pattern = re.compile(r'\b\w{3}\b', re.DEBUG)

text = "The quick brown fox jumps over the lazy dog"

# The debugging information is printed when the pattern is compiled; the pattern can then be used as usual
pattern.findall(text)

By continuing to learn and practice these advanced regular expression techniques, you will be able to apply regular expressions to a wide range of text matching and processing tasks and improve the efficiency and maintainability of your code. Regular expressions are among the most powerful and flexible tools in Python for handling string patterns.

Summary

In this article, we explored how to apply the re module in Python so that you can handle regular expressions more flexibly and efficiently. We started with basic pattern matching and introduced how to use the re module for matching, grouping, replacing, and other operations. We then dug into some advanced techniques, including greedy and non-greedy matching, backreferences, multi-line matching, lookaround assertions, and more, which can help you handle complex text processing tasks. In addition, we introduced some practical functions and flags, such as re.finditer(), replacement functions for re.sub(), re.split(), etc., so that you can apply regular expressions more flexibly to practical problems.

Mastering regular expressions is a very important part of Python programming. It can help us handle tasks such as string pattern matching and text extraction faster, improving the efficiency and maintainability of our code. Through continuous learning and practice, you will be able to understand and apply regular expressions more deeply, solve various text processing problems, and improve your skills in Python programming. I hope this article will be helpful to you, and you are welcome to continue exploring and learning more about regular expressions.
