1. Requirements
We want to match or search text for specific patterns.
2. Solution
If you only need to match simple literal text, the basic string methods are usually enough, such as str.find(), str.startswith(), str.endswith(), or similar functions.
Example:
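text = 'mark, handsome guy, 18, 183 handsome, mark'
print(text == 'mark')               # exact comparison
print(text.startswith('mark'))      # match at the start
print(text.endswith('mark'))        # match at the end
print(text.find('handsome guy'))    # index of the first occurrence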
result:
False
True
True
6
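As a small supplement to the example above, the following sketch shows two related details of the plain string approach: str.find() returns -1 when the substring is absent, and the in operator is enough when only a yes/no answer is needed. The variable names are illustrative and do not come from the original example.

```python
text = 'mark, handsome guy, 18, 183 handsome, mark'

# find() returns the index of the first occurrence, or -1 if absent
pos_found = text.find('handsome')
pos_missing = text.find('ugly')

# The "in" operator gives a plain True/False answer
has_guy = 'guy' in text

print(pos_found)    # 6
print(pos_missing)  # -1
print(has_guy)      # True
```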
For more complex matching, you need regular expressions and the re module. To illustrate the basic flow of using regular expressions, suppose we want to match dates written in numeric form, such as "11/27/2018". For example:
import re

text1 = '11/27/2018'
text2 = 'Nov 27, 2018'

# re.match() returns a match object on success and None on failure
if re.match(r'\d+/\d+/\d+', text1):
    print('Matches the pattern: digits/digits/digits')
else:
    print('Does not match the pattern: digits/digits/digits')

if re.match(r'\d+/\d+/\d+', text2):
    print('Matches the pattern: digits/digits/digits')
else:
    print('Does not match the pattern: digits/digits/digits')
Running results:
Matches the pattern: digits/digits/digits
Does not match the pattern: digits/digits/digits
If you intend to perform many matches with the same pattern, it usually pays to precompile the regular expression into a pattern object first.
For example:
import re

text1 = '11/27/2018'
text2 = 'Nov 27, 2018'

# Compile once, then reuse the pattern object
datepat = re.compile(r'\d+/\d+/\d+')

if datepat.match(text1):
    print('Matches the pattern: digits/digits/digits')
else:
    print('Does not match the pattern: digits/digits/digits')

if datepat.match(text2):
    print('Matches the pattern: digits/digits/digits')
else:
    print('Does not match the pattern: digits/digits/digits')
result:
Matches the pattern: digits/digits/digits
Does not match the pattern: digits/digits/digits
The match() method always tries to find a match at the beginning of the string. If you want to search the entire text for all matches, use the findall() method instead, for example:
import re

text = 'Today is 11/27/2018, yesterday was 11/26/2018'
datepat = re.compile(r'\d+/\d+/\d+')
print(datepat.findall(text))
Running results:
['11/27/2018', '11/26/2018']
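To make the difference between matching at the start and searching the whole text concrete, here is a minimal sketch. It uses the pattern method search(), which is not part of the original example but is a standard method of compiled patterns; it returns the first match found anywhere in the string.

```python
import re

datepat = re.compile(r'\d+/\d+/\d+')
text = 'Today is 11/27/2018, yesterday was 11/26/2018'

# match() only looks at the start of the string, so it finds nothing here
at_start = datepat.match(text)

# search() scans forward and returns the first match anywhere in the string
first = datepat.search(text)

print(at_start)        # None
print(first.group())   # 11/27/2018
```

When only the first occurrence matters, search() avoids building the full list that findall() returns.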
When defining regular expressions, we often introduce capture groups by enclosing parts of the pattern in parentheses. Capture groups usually simplify the subsequent processing of the matched text, because the content of each group can be extracted separately.
The findall() method searches the entire text, finds all matches, and returns them as a list. If you want to find the matches iteratively instead, use the finditer() method.
For example:
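import re

# Add capture groups
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
m = datepat.match('11/27/2018')
print(m.group(0))    # the whole match
print(m.group(1))
print(m.group(2))
print(m.group(3))
print(m.groups())    # all groups as a tuple
month, day, year = m.groups()
print(month)
print(day)
print(year)
print('*' * 20)
text = 'Today is 11/27/2018, yesterday was 11/26/2018'
# With groups, findall() returns a list of tuples
for month, day, year in datepat.findall(text):
    print('{}-{}-{}'.format(year, month, day))
print('*' * 20)
# finditer() yields match objects one at a time
for m in datepat.finditer(text):
    print(m.groups())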
result:
11/27/2018
11
27
2018
('11', '27', '2018')
11
27
2018
********************
2018-11-27
2018-11-26
********************
('11', '27', '2018')
('11', '26', '2018')
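Because finditer() yields match objects rather than plain tuples, each result also carries its position in the text. The following is a small sketch of that idea; the names found, whole, start and end are illustrative only.

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
text = 'Today is 11/27/2018, yesterday was 11/26/2018'

# Collect the matched text together with its start and end offsets
found = [(m.group(0), m.start(), m.end()) for m in datepat.finditer(text)]

for whole, start, end in found:
    # Slicing the text with the offsets gives back the matched string
    print(whole, start, end, text[start:end] == whole)
```

This is one reason to prefer finditer() over findall() when you need more than the captured strings.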
3. Analysis
This section covers the basics of using the re module to match and search text. The essential approach is to compile a pattern with re.compile() and then use methods such as match(), findall(), or finditer() to do the matching and searching.
When specifying a pattern we usually use raw strings, for example:
r'(\d+)/(\d+)/(\d+)'
In such strings the backslash is left uninterpreted, which is very useful in regular expressions. Otherwise, we would need a double backslash to represent each single backslash, for example:
'(\\d+)/(\\d+)/(\\d+)'
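A quick way to convince yourself that the two spellings are equivalent is to compare them directly. This is only an illustrative check, and the variable names are made up for the example.

```python
raw = r'(\d+)/(\d+)/(\d+)'
escaped = '(\\d+)/(\\d+)/(\\d+)'

# Both literals produce exactly the same string
same = raw == escaped

# In a raw literal, backslash-n stays as two characters instead of a newline
raw_len = len(r'\n')
plain_len = len('\n')

print(same)       # True
print(raw_len)    # 2
print(plain_len)  # 1
```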
Please note that the match() method only checks the beginning of the string, so it may match something you did not intend, for example:
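import re

# Add capture groups
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
# The trailing 'xxxx' is ignored, so this still matches
m = datepat.match('11/27/2018xxxx')
print(m)
result (as printed by Python 3.7 or later):
<re.Match object; span=(0, 10), match='11/27/2018'>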
If you want an exact match, end the pattern with the end-of-string anchor $:
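import re

# Add capture groups and anchor the pattern at the end with $
datepat = re.compile(r'(\d+)/(\d+)/(\d+)$')
m1 = datepat.match('11/27/2018xxxx')
m2 = datepat.match('11/27/2018')
print(m1)
print(m2)
result (as printed by Python 3.7 or later):
None
<re.Match object; span=(0, 10), match='11/27/2018'>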
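As an alternative to the $ anchor, the sketch below uses fullmatch(), a standard pattern method (Python 3.4 and later) that is not mentioned in the original text. It also shows one subtlety worth knowing: by default, $ still matches just before a trailing newline, whereas fullmatch() requires the entire string to match.

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

# fullmatch() succeeds only if the whole string matches the pattern
full_ok = datepat.fullmatch('11/27/2018')
full_extra = datepat.fullmatch('11/27/2018xxxx')

# By default, $ also matches just before a trailing newline
dollar_pat = re.compile(r'(\d+)/(\d+)/(\d+)$')
with_newline = dollar_pat.match('11/27/2018\n')
full_newline = datepat.fullmatch('11/27/2018\n')

print(full_ok is not None)       # True
print(full_extra is None)        # True
print(with_newline is not None)  # True
print(full_newline is None)      # True
```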
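If you are only doing a simple match or search, you can skip the compilation step and call the module-level functions of the re module directly.
If you intend to perform many matching or search operations, it is usually worth compiling the pattern first and then reusing it. The module-level functions keep a cache of recently compiled patterns, so the performance cost is small, but using your own compiled pattern still saves a few lookups and some extra processing.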
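The two styles can be compared side by side. The following is a minimal sketch using the same sample text as earlier; it shows that the module-level re.findall() and a compiled pattern's findall() return the same result.

```python
import re

text = 'Today is 11/27/2018, yesterday was 11/26/2018'

# Module-level function: the pattern is passed on every call
via_module = re.findall(r'(\d+)/(\d+)/(\d+)', text)

# Compiled pattern: compile once, then reuse the object
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
via_compiled = datepat.findall(text)

print(via_module)
print(via_module == via_compiled)  # True
```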