SoFunction
Updated on 2025-04-06

Detailed explanation of the application of regular expressions in Linux

1. Composition

Normal characters: ordinary strings with no special meaning
Special characters: characters that have a special meaning in regular expressions
Common metacharacters [special characters] in regular expressions

2. Meta characters in POSIX BRE [Basic] and ERE [Extended]

\ : Usually used to turn the special meaning of the following character on or off, as in \(...\) [\ is the escape character; it removes the special meaning of a symbol. (), {}, etc. also have special meanings in the shell]
The difference between . and \.: an unescaped . matches any single character, while \. matches only a literal period.

[root@localhost ~]# cat -n
     1  gd
     2  god
     3
     4  good
     5  goood
     6  goad
     7
     8  gboad

2.1. . : Match any single character (it cannot match nothing, so empty lines are not matched)

[root@localhost ~]# grep -n "."       
1:gd
2:god
4:good
5:goood
6:goad
8:gboad
[root@localhost ~]# grep -n ""
4:good
6:goad
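
To make the difference between . and \. concrete, here is a minimal sketch. The three sample strings are made up for illustration and are not part of the file used above.

```shell
# Hypothetical sample input, one entry per line.
sample() { printf '%s\n' 'a.c' 'abc' 'ac'; }

# Unescaped dot: any single character may sit between a and c.
any_char=$(sample | grep 'a.c')

# Escaped dot: only a literal period may sit between a and c.
literal_dot=$(sample | grep 'a\.c')

printf 'a.c matches:\n%s\n' "$any_char"
printf 'a\\.c matches:\n%s\n' "$literal_dot"
```

The line "ac" matches neither pattern, because both require exactly one character between a and c.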

2.2. * : Match the character before it any number of times (zero or more); for example, o* matches no o, one o, or several o's

[root@localhost ~]# grep -n "*"
[root@localhost ~]# grep -n "o*"
1:gd
2:god
3:
4:good
5:goood
6:goad
7:
8:gboad
[root@localhost ~]# echo "gbad" >>
[root@localhost ~]# echo "pbad" >>
[root@localhost ~]# echo "kgbad" >>
[root@localhost ~]# echo "poad" >>  
[root@localhost ~]# grep -n "go*" [the o may be absent, but the g in front of it must be present]
1:gd
2:god
4:good
5:goood
6:goad
8:gboad
9:gbad
11:kgbad
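
The behaviour of * can be checked without a file by piping a few words into grep. This is only a sketch; the word list is an assumption chosen to mirror the examples above.

```shell
# Hypothetical word list, one word per line.
words() { printf '%s\n' 'gd' 'god' 'good' 'bad'; }

# g, then zero or more o, then d: matches gd, god, and good.
star_matches=$(words | grep 'go*d')

# o* may match zero characters, so every line matches, even "bad".
# grep -c prints the number of matching lines.
empty_ok=$(words | grep -c 'o*')

printf '%s\n' "$star_matches"
printf 'lines matching o*: %s\n' "$empty_ok"
```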

2.3. .* : Match any sequence of characters (matches everything), including the empty string

[root@localhost ~]# grep -n ".*"
1:gd
2:god
3:
4:good
5:goood
6:goad
7:
8:gboad
9:gbad
10:pbad
11:kgbad
12:poad
[root@localhost ~]# grep -n "go.*"
2:god
4:good
5:goood
6:goad
[root@localhost ~]# grep -n "po.*"  
12:poad
[root@localhost ~]# echo "pgoad" >>   
[root@localhost ~]# grep -n "go.*"   [go followed by any characters, possibly none]
2:god
4:good
5:goood
6:goad
13:pgoad
[root@localhost ~]#
[root@localhost ~]# grep -n "o.*"  
2:god
4:good
5:goood
6:goad
8:gboad
12:poad
13:pgoad
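
To see which part of each line .* actually consumes, grep -o (supported by GNU and BSD grep) prints only the matched text. A minimal sketch with made-up input:

```shell
# Hypothetical input lines.
lines() { printf '%s\n' 'pgoad' 'go' 'gd'; }

# -o prints only the matched portion of each matching line.
# "go.*" starts at the first "go" and runs to the end of the line;
# on the line "go" the .* part matches nothing at all.
matched_part=$(lines | grep -o 'go.*')

printf '%s\n' "$matched_part"
```

The line "gd" produces no output because it does not contain "go".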

2.4. ^ : Anchor the regular expression that follows it to the start of the line (lines starting with ...)

[root@localhost tmp]# grep "^root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
[root@localhost tmp]#

2.5. $ : Anchor the regular expression that precedes it to the end of the line (lines ending with ...)

[root@localhost tmp]# grep "bash$" /etc/passwd | head -1
root:x:0:0:root:/root:/bin/bash
[root@localhost tmp]#
^$: matches an empty line
"^#|^$": matches comment lines that start with # as well as empty lines (| is an ERE operator, so use grep -E)

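A common use of these anchors is stripping comments and empty lines from a configuration file. The sketch below uses invented configuration lines and compares the anchored pattern with an unanchored one.

```shell
# Hypothetical configuration content.
config() {
  printf '%s\n' '# listen port' 'port=22' '' '#debug=on' 'user=root # admin'
}

# -v inverts the match: drop lines that START with # and empty lines.
anchored=$(config | grep -Ev '^#|^$')

# Without ^ before #, any line that merely CONTAINS # is dropped too.
unanchored=$(config | grep -Ev '#|^$')

printf 'anchored:\n%s\n' "$anchored"
printf 'unanchored:\n%s\n' "$unanchored"
```

With the anchored pattern the line "user=root # admin" survives; with the unanchored pattern only "port=22" is left.
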
2.6. [] : Match any single character listed in the square brackets

For example, [sS] matches s or S. A hyphen (-) can be used to specify a range (such as [0-9], which matches any character from 0 to 9). If ^ appears as the first character inside the brackets, as in [^0-9], any character not in the list is matched.

[root@localhost tmp]# cat hosts
192.168.200.1
192.168.200.3
.123.5
.56.1
1456.1.2.4
12.4.5.6.8
[root@localhost tmp]# grep -E '([0-9]{1,3}\.){3}[0-9]{1,3}' hosts  
192.168.200.1
192.168.200.3
1456.1.2.4
12.4.5.6.8
[root@localhost tmp]# grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' hosts
192.168.200.1
192.168.200.3
[root@localhost tmp]#
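
The bracket forms described above ([sS], [0-9], [^0-9]) can be tried on their own. This is a sketch on invented input, not output from the hosts file.

```shell
# Hypothetical input lines.
items() { printf '%s\n' 'Server' 'server' 'port 22' 'sshd'; }

# [sS] accepts either case for the first letter.
either_case=$(items | grep '^[sS]erver')

# [0-9] matches lines containing at least one digit.
has_digit=$(items | grep '[0-9]')

# ^[^0-9]*$ matches lines made up entirely of non-digits.
no_digit=$(items | grep '^[^0-9]*$')

printf '%s\n' "$either_case" "$has_digit" "$no_digit"
```

Note the two roles of ^ in the last pattern: outside the brackets it anchors to the start of the line, inside the brackets it negates the list.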

2.7. ? : Match the previous character or group zero or one time (ERE syntax, hence grep -E)

[root@localhost ~]# grep -E "go?d"   
gd
god
[root@localhost ~]#
[root@localhost tmp]# cat test
do
does
doxy
[root@localhost tmp]# grep -E "do(es)?" test
do
does
doxy
[root@localhost tmp]#
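
In the output above, doxy is listed because the pattern is not anchored: the leading "do" of doxy already satisfies do(es)?. A sketch of how anchoring narrows the result, using the same three words as inline input:

```shell
# The three words from the test file, supplied inline.
words() { printf '%s\n' 'do' 'does' 'doxy'; }

# Unanchored: "do" is found inside every line.
unanchored=$(words | grep -E 'do(es)?')

# Anchored with ^ and $: the whole line must be "do" or "does".
anchored=$(words | grep -E '^do(es)?$')

# -x asks grep to match whole lines, which has the same effect here.
whole_line=$(words | grep -Ex 'do(es)?')

printf '%s\n' "$anchored"
```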

3. Characters only found in POSIX BRE (basic regularity)

\{n,m\}: interval expression; matches repetitions of the single character before it. For example, https\{0,1\} repeats s 0 to 1 times. \{n\} matches exactly n times; \{n,m\} matches n to m times; \{n,\} matches at least n times; \{,m\} matches at most m times (accepted by GNU grep). [\ is the escape character]
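
In BRE the braces of an interval are escaped with backslashes, while in ERE they are written bare. A minimal sketch on an invented word list:

```shell
# Hypothetical word list.
words() { printf '%s\n' 'gd' 'god' 'good' 'goood'; }

# BRE: escaped braces, one or two o's between g and d.
bre=$(words | grep 'go\{1,2\}d')

# ERE: the same interval without backslashes.
ere=$(words | grep -E 'go{1,2}d')

# Open-ended interval: at least two o's.
at_least_two=$(words | grep 'go\{2,\}d')

printf 'BRE: %s\n' "$(printf '%s' "$bre" | tr '\n' ' ')"
printf 'ERE: %s\n' "$(printf '%s' "$ere" | tr '\n' ' ')"
```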

4. Characters only found in POSIX ERE (extended regularity)

4.1. {n,m}: The same function as BRE's \{n,m\}, written without the backslashes

[root@localhost tmp]# grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' hosts
192.168.200.1
192.168.200.3

4.2. + : Match the previous regular expression one or more times

[root@localhost ~]# egrep "go+d"
god
good
goood
[root@localhost ~]#

4.3. | : Alternation; matches any one of several strings [or]

[root@localhost ~]# grep -E "3306|1521" /etc/services
mysql           3306/tcp                        # MySQL
mysql           3306/udp                        # MySQL
ncube-lm        1521/tcp                # nCube License Manager
ncube-lm        1521/udp                # nCube License Manager
[root@localhost ~]#

4.4. ( ): Grouping and back-references

Group filtering

[root@localhost ~]# echo "glad" >>
[root@localhost ~]# egrep "(la|oo)"
good
goood
glad

() Back-reference: when part of the pattern is enclosed in parentheses, the text matched by the first pair of parentheses can be referred to as \1 later on, the second as \2, and so on.

 [root@localhost tmp]# ifconfig |sed -rn 's#.*addr:(.*)(B.*)$#\1#gp'
192.168.4.27 
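
The sed command above can be reproduced without a live interface by feeding it a single line. The line below is an assumed example in the older net-tools ifconfig format. The sketch uses sed -E, which GNU sed treats the same as -r.

```shell
# Hypothetical line in the older net-tools ifconfig output format.
line='inet addr:192.168.4.27  Bcast:192.168.4.255  Mask:255.255.255.0'

# Group 1 is everything between "addr:" and the text starting at "B".
# The spaces before "Bcast" belong to group 1, so they are kept.
addr=$(printf '%s\n' "$line" | sed -En 's#.*addr:(.*)(B.*)$#\1#gp')

# Two groups, swapped in the replacement with \2 and \1.
swapped=$(printf '%s\n' 'user:root' | sed -E 's/^([^:]*):([^:]*)$/\2:\1/')

# In a BRE pattern, \1 repeats whatever group 1 matched: a doubled character.
doubled=$(printf '%s\n' 'god' 'good' 'glad' | grep '\(.\)\1')

printf '[%s]\n' "$addr" "$swapped" "$doubled"
```

The square brackets in the last printf only make the trailing spaces of the captured address visible.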

5. Other metacharacters (GNU and Perl-style extensions)

5.1.\b: Match a word boundary

[root@localhost tmp]# cat test       
do
does
doxy
agdoeg
[root@localhost tmp]# grep "do\b" test
do
[root@localhost tmp]# grep "\bdo" test       
do
does
doxy
[root@localhost tmp]# grep "\bdoes" test         
does
[root@localhost tmp]# grep "\bdo\b" test 
do
[root@localhost tmp]#

5.2. \B: Match non-word boundaries, opposite to \b

[root@localhost tmp]# grep "do\B" test   
does
doxy
agdoeg
[root@localhost tmp]# grep "do\b" test
do
[root@localhost tmp]#

5.3. \d: Match a numeric character, equivalent to [0-9] (Perl syntax; with GNU grep it requires grep -P)

5.4. \D: Match a non-numeric character, equivalent to [^0-9] (Perl syntax; with GNU grep it requires grep -P)

5.5. \w: Match letters, numbers, and underscores, equivalent to [A-Za-z0-9_]

There are many more metacharacters, so I won't list them all here
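
Since \d is not understood by grep in its basic or extended modes, a POSIX character class is a portable stand-in. The sketch below assumes GNU grep for \w and uses invented input.

```shell
# Hypothetical input lines.
items() { printf '%s\n' 'port 8080' 'user_name' '---'; }

# [[:digit:]] is the POSIX class for digits, usable without grep -P.
digits=$(items | grep '[[:digit:]]')

# \w (a GNU extension) matches letters, digits, and the underscore.
word_only=$(items | grep '^\w*$')

# The same test written with a POSIX bracket expression.
portable=$(items | grep '^[[:alnum:]_]*$')

printf '%s\n' "$digits" "$word_only"
```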

Case: Streamlining startup services (turn off every service except crond, network, rsyslog, sshd, and sysstat)

[root@localhost ~]# chkconfig --list| egrep -v "crond|network|rsyslog|sshd|sysstat" | awk '{print "chkconfig",$1,"off"}'|bash
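
Before piping generated commands into bash, it can help to preview them. The sketch below replaces chkconfig --list with a few invented lines in a similar layout and stops before the final | bash, so nothing is executed. grep -E is used in place of egrep; the two are equivalent.

```shell
# Hypothetical stand-in for "chkconfig --list" output.
service_list() {
  printf '%s\n' \
    'crond    0:off 1:off 2:on 3:on 4:on 5:on 6:off' \
    'cups     0:off 1:off 2:on 3:on 4:on 5:on 6:off' \
    'sshd     0:off 1:off 2:on 3:on 4:on 5:on 6:off' \
    'postfix  0:off 1:off 2:on 3:on 4:on 5:on 6:off'
}

# Drop the services to keep, then build one "off" command per remaining line.
preview=$(service_list \
  | grep -Ev 'crond|network|rsyslog|sshd|sysstat' \
  | awk '{print "chkconfig",$1,"off"}')

printf '%s\n' "$preview"
```

Once the preview looks right, appending | bash to the pipeline runs the generated commands, as in the original one-liner.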