SoFunction
Updated on 2025-03-01

Using regular expressions in Python to extract specific information

1. Remove a Python-style comment from a string

Case:

import re
time = "2020-01-01 # This is a date"
num = re.sub(r'#.*$', "", time)  # match from # to the end of the string and remove it
print("This time is:", num)

result:

This time is: 2020-01-01
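The pattern above works on a single line and leaves the space before the # in place. A minimal sketch of a multi-line variant follows; the helper name strip_comments and the second sample line are hypothetical, and the sketch assumes every # starts a comment (it does not recognise a # inside a quoted string).

```python
import re

def strip_comments(text):
    """Remove '#' comments from every line of text (hypothetical helper)."""
    # [ \t]* also removes the blanks in front of '#';
    # re.MULTILINE lets $ match at the end of each line, not only at the end of the string.
    return re.sub(r'[ \t]*#.*$', '', text, flags=re.MULTILINE)

sample = "2020-01-01 # This is a date\n2020-01-02 # Another date"
print(strip_comments(sample))
```

Using [ \t]* rather than \s* keeps the match from swallowing the newline in front of a line that contains only a comment.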

2. Extract the content before or after a symbol

Case 1:

txt = 'My phone number is: 131-246-XXX19'
a = txt.split(':')[0]  # [0] is the part before the symbol
b = txt.split(':')[1]  # [1] is the part after the symbol; [-1] gives the same result here
print("The result of a is:", a)
print("The result of b is:", b)

result:

The result of a is: My phone number is
The result of b is:  131-246-XXX19
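Indexing the result of split() only works cleanly when the symbol appears exactly once. As a hedged alternative, str.partition() splits at the first occurrence only and keeps everything after it. The sample string below, with a second colon, is made up for illustration.

```python
txt = 'My phone number is: 131-246-XXX19 (ext: 7)'  # illustrative input with two colons

# partition() returns (text before, separator, text after) for the FIRST ':' only
before, sep, after = txt.partition(':')

print(before)         # text before the first ':'
print(after.strip())  # text after it, with the leading space removed
```

With this input, txt.split(':')[1] would stop at the second colon, while partition() keeps the rest of the string together.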

Case 2:

import re
txt = "I love python . I love python"
text = re.sub(r'\..*$', "", txt)  # \. matches a literal dot (the backslash escapes it); .*$ matches everything after it
print("The result of this interception is:", text)

result:

The result of this interception is: I love python

3. Remove non-digit characters

Case:

import re
time = "2020-01-01 # This is a date"
num = re.sub(r'\D', "", time)  # \D matches any character that is not a digit
print("This time is:", num)

result:

This time is: 20200101
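Removing every non-digit fuses the year, month and day into one number. If the separate groups are needed, re.findall() with \d+ is one way to get them; the sketch below assumes the string contains exactly three digit groups.

```python
import re

time = "2020-01-01 # This is a date"

# \d+ matches each run of consecutive digits
parts = re.findall(r'\d+', time)
print(parts)

# Assumes exactly three groups: year, month, day
year, month, day = (int(p) for p in parts)
print(year, month, day)
```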

4. Keep only Chinese characters

Case:

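import re
reg = "[^\u4e00-\u9fa5]"  # matches anything outside the common Chinese range U+4E00 to U+9FA5
text = "好的！E. 我们开始学习 34--python!"  # sample text mixing Chinese, English, digits and punctuation
print(re.sub(reg, '', text))

result:

好的我们开始学习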
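Deleting everything else joins all the Chinese characters into one string. A small sketch of the opposite approach, matching the Chinese runs directly with re.findall() so that each run stays separate, is shown here. The sample string is illustrative, and the range U+4E00 to U+9FA5 covers the common CJK ideographs only.

```python
import re

text = "好的！E. 我们开始学习 34--python!"  # illustrative mixed-language input

# A positive character class: one or more consecutive characters in U+4E00..U+9FA5
runs = re.findall(r'[\u4e00-\u9fa5]+', text)
print(runs)
```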

5. Keep only Chinese characters, English letters and Arabic numerals

Case:

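import re
reg = "[^0-9A-Za-z\u4e00-\u9fa5]"  # matches anything that is not a digit, an English letter or a Chinese character
text = "好的！E. 我们开始学习 34--python!"  # sample text mixing Chinese, English, digits and punctuation
print(re.sub(reg, '', text))

result:

好的E我们开始学习34python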

6. Remove English letters and digits

Case:

import re
txt = "Ayouleyang~Youle-Yang"
text = re.sub('[a-zA-Z0-9]', '', txt)  # remove every English letter and digit
print(text)

result:

~-

7. Remove specific characters

Case:

import re
txt = 'A*a# same as $le. :Yang;:youle+'
text = re.sub('[,;: .$*#youle]+', "", txt)  # every character listed inside [] is removed, similar to chained replace() calls
print(text)

result:

AasamasYang+
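When the characters to delete are kept in a variable, writing the class by hand makes it easy to forget an escape. A minimal sketch, assuming the unwanted characters are plain single characters: re.escape() builds the class safely, and str.translate() gives the same result without a regular expression. The names unwanted and pattern and the shortened sample string are illustrative.

```python
import re

unwanted = ",;: .$*#"      # characters to delete (illustrative set)
txt = 'A*a# same as $le. :Yang;'

# re.escape() backslash-escapes the characters that are special in a pattern
pattern = re.compile('[' + re.escape(unwanted) + ']+')
with_regex = pattern.sub('', txt)
print(with_regex)

# Same idea without re: map every unwanted character to None
with_translate = txt.translate(str.maketrans('', '', unwanted))
print(with_translate)
```

Both calls print the same string; translate() is usually enough when only single characters are involved.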

8. Keep English letters, digits and selected symbols

Case:

import re
# Keep only English letters, digits and "-". The trailing "-" is a literal hyphen,
# the symbol to retain; \u4e00 (the character 一) is kept as well.
reg = "[^0-9A-Za-z\u4e00-]"
txt = "Okay! My number is 131-246-XXX19!::"
text = re.sub(reg, '', txt)
print(text)

result:

OkayMynumberis131-246-XXX19
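If the goal is only the number itself, matching it directly can be simpler than deleting everything around it. The sketch below assumes the number always has the shape of three digits, a hyphen, three digits, a hyphen, then letters or digits; that shape is an assumption based on the sample string, not a general phone-number rule.

```python
import re

txt = "Okay! My number is 131-246-XXX19!::"

# Assumed shape: 3 digits - 3 digits - one or more letters/digits
match = re.search(r'\d{3}-\d{3}-\w+', txt)

# re.search() returns None when nothing matches, so guard before calling group()
number = match.group() if match else None
print(number)
```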

Summary

The examples above show how to use Python regular expressions, mainly re.sub(), to remove or extract specific parts of a string. I hope they are helpful to you.