SoFunction
Updated on 2025-04-10

How to match special strings with Python regular expressions

Matching special strings with Python regular expressions

Match special strings

Match a substring of a specific format within a string: first find the substring that follows the special rule, then extract the values at the relevant positions.

import re

strings = ['result-2023-08-18-6g1s1ch-DB9909',
           'result-2023-08-18-4g1s3ch-DB9909',
           'result-2023-08-18-1g4s1ch-DB9909',
           'result-2023-08-18-1g1s1ch-DB9909']

# Six groups: count, 'g', count, 's', count, 'ch' (letters in either case)
pattern = r'(\d+)([Gg])(\d+)([Ss])(\d+)([Cc][Hh])'

results = []
for text in strings:
    match = re.search(pattern, text)
    if match:
        print(match.group())   # The whole matched substring, e.g. 6g1s1ch
        g = match.group(2)     # Content of the 2nd bracket
        s = match.group(4)     # Content of the 4th bracket
        ch = match.group(6)    # Content of the 6th bracket
        string = match.group(1) + g + match.group(3) + s + match.group(5) + ch
        results.append(string)
print(results)


db_pattern = r'([Dd][Bb])(\d+)'

match = re.search(db_pattern, strings[0])
if match:
    print(match.group())     # The whole matched substring: DB9909
    db = match.group(1)      # Content of the 1st bracket
    number = match.group(2)  # Content of the 2nd bracket
    db_number = db + number

Output:

6g1s1ch
4g1s3ch
1g4s1ch
1g1s1ch
['6g1s1ch', '4g1s3ch', '1g4s1ch', '1g1s1ch']
DB9909
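
The two patterns above can also be combined into a single expression with named groups, which avoids counting brackets. The following is only a minimal sketch, not the method used above: the function name parse_result_name and the group names config, g, s, ch and db are hypothetical and chosen for illustration, and it assumes the DB number follows the configuration directly after a hyphen, as in the sample strings.

```python
import re

# Hypothetical combined pattern: named groups replace numeric group indexes.
# re.IGNORECASE covers 'g'/'G', 's'/'S', 'ch'/'CH' and 'db'/'DB'.
RESULT_PATTERN = re.compile(
    r'(?P<config>(?P<g>\d+)g(?P<s>\d+)s(?P<ch>\d+)ch)-(?P<db>db\d+)',
    re.IGNORECASE,
)


def parse_result_name(name):
    """Return the extracted fields as a dict, or None if nothing matches."""
    match = RESULT_PATTERN.search(name)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the three counts to integers for easier comparison.
    for key in ('g', 's', 'ch'):
        fields[key] = int(fields[key])
    return fields


info = parse_result_name('result-2023-08-18-4g1s3ch-DB9909')
print(info)
# {'config': '4g1s3ch', 'g': 4, 's': 1, 'ch': 3, 'db': 'DB9909'}
```

Named groups keep the code readable if the pattern later gains or loses a bracket, since the lookups no longer depend on position.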

Extract special strings

fullDump_pDevice00000286923A19B0_frame000_1g1s1ch.gfxbench_inst2_F535

pDevice may be followed by a run of other digits and letters. Only the part of the string starting at the frame number (frame000 here) is needed, such as:

frame000_1g1s1ch.gfxbench_inst2_F535

import re

s = "fullDump_pDevice00000286923A19B0_frame000_1g1s1ch.gfxbench_inst2_F535" 

# Match the prefix to remove
prefix_pattern = r'^fullDump_pDevice\d+_'

# Use sub() to remove the matched prefix
result = re.sub(prefix_pattern, '', s)

print(result)

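The regular expression above does not match, so nothing is replaced and the output is still the original string. The reason is that \d+ matches only decimal digits, while the device ID 00000286923A19B0 also contains letters, so the underscore required by the pattern is never reached: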

fullDump_pDevice00000286923A19B0_frame000_1g1s1ch.gfxbench_inst2_F535
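
To see where the match breaks down, it can help to check how far \d+ actually gets. The snippet below is a small diagnostic sketch (the variable names partial and unchanged are illustrative): it drops the trailing underscore from the pattern and prints where the digits stop.

```python
import re

s = "fullDump_pDevice00000286923A19B0_frame000_1g1s1ch.gfxbench_inst2_F535"

# Same prefix pattern, but without the trailing underscore, to see where \d+ stops.
partial = re.match(r'^fullDump_pDevice\d+', s)
print(partial.group())        # fullDump_pDevice00000286923
print(s[partial.end()])       # A  (a letter, so the required '_' cannot follow)

# With the underscore the pattern does not match at all, so sub() changes nothing.
unchanged = re.sub(r'^fullDump_pDevice\d+_', '', s)
print(unchanged == s)         # True
```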

Use the following expression instead, which also allows letters after the leading digits of the device ID:

s = "fullDump_pDevice0000028fd3B19D0_frame000_1g1s1ch.gfxbench_inst2_F535" 
prefix_pattern = r'^fullDump_pDevice(\d+)([A-Za-z0-9]+)_'
new = re.sub(prefix_pattern, "", s)
print(new)

Output result:

frame000_1g1s1ch.gfxbench_inst2_F535
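
If the device ID is always hexadecimal, the character class can be narrowed to hex digits, or the prefix can be ignored entirely by searching for the frame part. Both variants below are a sketch under that assumption, which the sample strings suggest but do not guarantee; the helper names strip_prefix and from_frame are hypothetical.

```python
import re

# Assumption: the device ID consists only of hexadecimal digits.
PREFIX_PATTERN = re.compile(r'^fullDump_pDevice[0-9A-Fa-f]+_')
# Alternative: skip the prefix and capture everything from 'frame<digits>_' on.
FRAME_PATTERN = re.compile(r'frame\d+_.*')


def strip_prefix(name):
    """Remove the 'fullDump_pDevice<hex>_' prefix if present."""
    return PREFIX_PATTERN.sub('', name)


def from_frame(name):
    """Return the substring starting at 'frame<digits>_', or None."""
    match = FRAME_PATTERN.search(name)
    return match.group() if match else None


for name in (
    "fullDump_pDevice00000286923A19B0_frame000_1g1s1ch.gfxbench_inst2_F535",
    "fullDump_pDevice0000028fd3B19D0_frame000_1g1s1ch.gfxbench_inst2_F535",
):
    print(strip_prefix(name), from_frame(name))
```

Searching for the frame part directly is less sensitive to changes in the prefix, while removing the prefix leaves strings that do not start with fullDump_pDevice unchanged.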

Summary

The above is based on my personal experience. I hope it serves as a useful reference, and I appreciate your support.