SoFunction
Updated on 2025-03-03

Python Regular Expressions (?=...) and (?<=...) symbols

Introduction

I ran into a fairly tricky problem today, which finally pushed me to start using the regular expression constructs (?=...) and (?<=...).

Main text

(?=...) matches if ... matches next, but it does not consume any characters of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it is immediately followed by 'Asimov'.
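As a quick check of the Isaac (?=Asimov) example, here is a minimal sketch using re.search. The two input strings are made up for illustration.

```python
import re

# "Isaac " matches only when "Asimov" comes right after it.
hit = re.search(r'Isaac (?=Asimov)', 'Isaac Asimov')
miss = re.search(r'Isaac (?=Asimov)', 'Isaac Newton')

print(repr(hit.group()))  # 'Isaac ' (Asimov is checked but not included)
print(hit.span())         # (0, 6)
print(miss)               # None
```

The match ends at index 6, right before Asimov, which shows that the text inside the assertion is checked but never becomes part of the result.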

You may wonder what the () and ... in the notation above mean. The parentheses are part of the syntax, and the ... in (?=...) and (?<=...) is a placeholder that can be replaced with any pattern (for (?<=...), any fixed-length pattern, as explained in Example 3). For convenience, the examples below replace ... with \t.

Example 1

import re

str1 = 'abc\tdefghi\txyz'
# Search for a position that is immediately followed by a tab
print(re.search('(?=\t)', str1))
"""
result:
<re.Match object; span=(3, 3), match=''>
"""

You can see that the match is at index 3, the position of the first \t character. However, a lookahead assertion only checks the characters ahead of the current position without consuming them, and our pattern contains nothing before (?=\t), so the matched text is the empty string.
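To see that the assertion succeeds at every position followed by a tab, not only the first one, the following sketch lists all zero-width matches with re.finditer. It reuses str1 from the example; the variable names matches and positions are only illustrative.

```python
import re

str1 = 'abc\tdefghi\txyz'

# finditer reports every position where the lookahead succeeds.
matches = list(re.finditer(r'(?=\t)', str1))
positions = [m.start() for m in matches]

print(positions)                     # [3, 10]
print([m.group() for m in matches])  # ['', '']
```

Each match is empty because the lookahead only tests what comes next; the positions are the indexes of the two tab characters.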

Example 2

Let's change the pattern from Example 1 slightly.

import re

str1 = 'abc\tdefghi\txyz'
# Match 'abc' only when it is immediately followed by a tab
print(re.search('abc(?=\t)', str1))
"""
result:
<re.Match object; span=(0, 3), match='abc'>
"""

You can see that the string abc is matched, while the \t that follows it is not part of the match.
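For comparison, the sketch below puts the lookahead next to an ordinary pattern that consumes the tab. The variable names consuming and lookahead are only illustrative.

```python
import re

str1 = 'abc\tdefghi\txyz'

consuming = re.search(r'abc\t', str1)      # the tab is part of the match
lookahead = re.search(r'abc(?=\t)', str1)  # the tab is only checked

print(consuming.span())              # (0, 4)
print(lookahead.span())              # (0, 3)
print(repr(str1[lookahead.end()]))   # '\t' (still unconsumed)
```

With the lookahead, the match stops at index 3 and the tab is left in place for whatever comes next in the pattern or in a later search.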

Example 3

(?<=...) matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion. (?<=abc)def will find a match in abcdef, because the lookbehind backs up 3 characters and checks whether the contained pattern matches. The contained pattern must only match strings of a fixed length, which means that abc or a|b are allowed, but a* and a{3,4} are not. Note that a pattern that starts with a positive lookbehind assertion will not match at the beginning of the string being searched, because there is nothing to its left.

The explanation above is rather abstract, so what does it mean in practice? Let's look at an example.

import re

str1 = 'abc\tdefghi\txyz'
# Match 'def' only when it is immediately preceded by a tab
print(re.search('(?<=\t)def', str1))
"""
result:
<re.Match object; span=(4, 7), match='def'>
"""

Conceptually, the engine finds the string def, then looks back one character to check whether the character before def is \t. If it is, def is matched.
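The fixed-length rule and the note about the start of the string can be checked directly. This is a minimal sketch against the standard re module; the test strings and the flag variable_width_ok are made up for illustration.

```python
import re

# Fixed-length lookbehind patterns are accepted.
print(re.search(r'(?<=abc)def', 'abcdef').span())  # (3, 6)
print(re.search(r'(?<=a|b)c', 'xbc').group())      # c

# A variable-length lookbehind such as a* is rejected at compile time.
try:
    re.compile(r'(?<=a*)def')
    variable_width_ok = True
except re.error as exc:
    variable_width_ok = False
    print('rejected:', exc)

# The first 'def' sits at the start of the string with nothing before it,
# so only the second one, preceded by a tab, can match.
print(re.search(r'(?<=\t)def', 'def\tdef').span())  # (4, 7)
```

The error is raised when the pattern is compiled, so an invalid lookbehind is caught before any searching takes place.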

Example 4

What if we want to match the part between two \t characters?

import re

str1 = 'abc\tdefghi\txyz'
# Match a tab, anything in between, and another tab
print(re.search('\t(.*)\t', str1))
"""
result:
<re.Match object; span=(3, 11), match='\tdefghi\t'>
"""

You can see that the result now includes the \t characters at both ends, but we don't want them in the match. To exclude them, we can combine the lookbehind and lookahead assertions described above.

import re

str1 = 'abc\tdefghi\txyz'
# The tabs on both sides are checked but not included in the match
print(re.search('(?<=\t).*(?=\t)', str1))
"""
result:
<re.Match object; span=(4, 10), match='defghi'>
"""

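You can see that we matched the string between the two \t characters. The pattern above, however, does not mark the middle part explicitly. To make it explicit, we can wrap .* in a capture group, which gives the same overall match and also exposes the text as group(1):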

import re

str1 = 'abc\tdefghi\txyz'
# Same match as before; the text between the tabs is also available as group(1)
print(re.search('(?<=\t)(.*)(?=\t)', str1))
"""
result:
<re.Match object; span=(4, 10), match='defghi'>
"""

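One caveat the examples do not cover: .* is greedy, so when a string contains more than two \t characters, the match runs up to the last tab. The sketch below uses a made-up string with three tabs and shows one possible alternative, [^\t]*, which stops at the next tab. It is an illustration under that assumption, not the only way to write it.

```python
import re

text = 'a\tb\tc\td'  # hypothetical input with three tabs

# Greedy .* stretches to the last position that is followed by a tab.
greedy = re.search(r'(?<=\t).*(?=\t)', text).group()

# [^\t]* cannot cross a tab, so each field is matched separately.
fields = re.findall(r'(?<=\t)[^\t]*(?=\t)', text)

print(repr(greedy))  # 'b\tc'
print(fields)        # ['b', 'c']
```

For the two-tab string used throughout this article both patterns give defghi, so the difference only shows up with additional tabs.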