Regular Expressions Overview
A regular expression is a pattern that Linux tools such as sed and gawk use to filter text.
Basic regular expressions
Plain text
[root@node1 ~]# echo "this is a cat" | sed -n '/cat/p'
this is a cat
[root@node1 ~]# echo "this is a cat" | gawk '/cat/{print $0}'
this is a cat
Regular expression matching is strict. In particular, remember that patterns are case sensitive.
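To see the case sensitivity in practice, here is a minimal sketch that feeds two lines to sed. The sample text is made up for illustration; only the line whose case matches the pattern exactly should be printed.

```shell
# Two hypothetical input lines that differ only in case.
input=$(printf 'this is a cat\nThis is a Cat\n')

# /cat/ matches the lowercase "cat" only, so one line is printed.
lower=$(printf '%s\n' "$input" | sed -n '/cat/p')
echo "$lower"

# /Cat/ matches only the capitalized form.
upper=$(printf '%s\n' "$input" | sed -n '/Cat/p')
echo "$upper"
```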
Special characters
Special characters recognized by regular expressions include:
.*[]^${}\+?|()
To use a special character as an ordinary text character, you must escape it, usually with a backslash (\).
[root@node1 ~]# echo "this is a $" | sed -n '/\$/p'
this is a $
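The same escaping applies to the other special characters. As a small sketch with made-up input, compare an unescaped dot, which matches any character, with an escaped one, which matches only a literal period.

```shell
# Hypothetical sample lines: only the first contains a literal "3.5".
input=$(printf 'version 3.5\nversion 345\n')

# Unescaped, the dot matches any character, so both lines are printed.
unescaped=$(printf '%s\n' "$input" | sed -n '/3.5/p')
echo "$unescaped"

# Escaped, the dot matches only a literal period.
escaped=$(printf '%s\n' "$input" | sed -n '/3\.5/p')
echo "$escaped"
```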
Anchor characters
Two special characters anchor a pattern to the beginning or the end of a line in the data stream.
The caret (^) anchors the pattern to the beginning of the text line.
The dollar sign ($) anchors the pattern to the end of the line.
[root@node1 ~]# echo "this is a cat" | sed -n '/^this/p'
this is a cat
[root@node1 ~]# echo "this is a cat" | sed -n '/cat$/p'
this is a cat
In some cases, the two anchors can be combined in one pattern.
1. For example, to find lines that contain only a specific text:
[root@node1 ljy]# more
this is a dog
what
how
this is a cat
is a dog
[root@node1 ljy]# sed -n '/^is a dog$/p'
is a dog
2. Combining the two anchors with nothing between them matches blank lines, which can then be filtered out:
[root@node1 ljy]# more
this is a dog
what
how
this is a cat
is a dog
[root@node1 ljy]# sed '/^$/d'
this is a dog
what
how
this is a cat
is a dog
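Because blank lines are hard to see in a transcript, here is a self-contained sketch of the same idea. The input is hypothetical and is built with printf so the empty lines are explicit; grep -c is used only to count them and is not part of the original example.

```shell
# Hypothetical input with two empty lines between the text lines.
input=$(printf 'this is a dog\n\nwhat\n\nhow\n')

# ^$ matches a line with nothing between its start and end; d deletes it.
cleaned=$(printf '%s\n' "$input" | sed '/^$/d')
echo "$cleaned"

# Count the blank lines instead of deleting them.
blank_count=$(printf '%s\n' "$input" | grep -c '^$')
echo "$blank_count"
```

Note that a line containing only spaces is not empty, so ^$ does not match it.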
Dot character
The dot matches any single character except a newline, and it must match a character. In the example below, the line consisting only of "at" does not match .at because nothing precedes the "a".
[root@node1 ljy]# more
this is a dog
what
how
this is a cat
is a dog
at
[root@node1 ljy]# sed -n '/.at/p'
what
this is a cat
Character groups
To match one of a specific set of characters at a given position, use a character group. Square brackets define a character group.
[root@node1 ljy]# more
this is a dog
this is a Dog
this is a DoG
this is a cat
[root@node1 ljy]# sed -n '/[dD]og/p'
this is a dog
this is a Dog
[root@node1 ljy]# sed -n '/[dD]o[gG]/p'
this is a dog
this is a Dog
this is a DoG
Excluding character groups
To match any character that is not in the group, place a caret (^) at the start of the character group, just inside the opening bracket.
[root@node1 ljy]# sed -n '/[dD]o[gG]/p'
this is a dog
this is a Dog
this is a DoG
[root@node1 ljy]# sed -n '/[^D]og/p'
this is a dog
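One detail that is easy to miss: an excluded group still has to match one character. The sketch below uses made-up input to show that a bare "og" at the start of a line does not match [^D]og, because there is no character in front of it.

```shell
# Hypothetical lines: "og" starts the second line, so nothing precedes it.
input=$(printf 'a dog\nog\na Dog\n')

# [^D] must consume one character that is not "D":
# "dog" qualifies, while a bare "og" and "Dog" do not.
matched=$(printf '%s\n' "$input" | sed -n '/[^D]og/p')
echo "$matched"
```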
Ranges
Inside a character group, a hyphen between two characters defines a range. The regular expression matches any character within that interval, endpoints included.
[root@node1 ljy]# more
123123
1231
121222222
412345341613
vsdvs
qwer12344123
12345
34211
444444
[root@node1 ljy]# sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p'
12345
34211
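Several ranges can share one character group. As an illustration with hypothetical names, the following sketch keeps only five-character strings that start with a lowercase letter and continue with lowercase letters or digits.

```shell
# Hypothetical identifiers to check, one per line.
input=$(printf 'node1\nNode_2\n9lives\nweb01\n')

# One lowercase letter, then exactly four lowercase letters or digits.
matched=$(printf '%s\n' "$input" | sed -n '/^[a-z][a-z0-9][a-z0-9][a-z0-9][a-z0-9]$/p')
echo "$matched"
```

In some locales a range may follow the locale's collation order rather than plain ASCII order; setting LC_ALL=C is a common way to avoid surprises.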
Extended regular expressions
Question mark
The question mark indicates that the preceding character appears zero times or one time, and no more.
[root@node1 ljy]# echo "bat" | gawk '/ba?t/{print $0}'
bat
[root@node1 ljy]# echo "baat" | gawk '/ba?t/{print $0}'
[root@node1 ljy]# echo "bt" | gawk '/ba?t/{print $0}'
bt
The question mark can also follow a character group, in which case one character from the group may appear zero times or one time.
[root@node1 ljy]# echo "bt" | gawk '/b[ae]?t/{print $0}'
bt
[root@node1 ljy]# echo "bat" | gawk '/b[ae]?t/{print $0}'
bat
[root@node1 ljy]# echo "bet" | gawk '/b[ae]?t/{print $0}'
bet
[root@node1 ljy]# echo "baat" | gawk '/b[ae]?t/{print $0}'
Plus sign
The plus sign indicates that the preceding character appears one or more times; it must appear at least once.
[root@node1 ljy]# echo "baat" | gawk '/b[ae]+t/{print $0}'
baat
[root@node1 ljy]# echo "bt" | gawk '/b[ae]+t/{print $0}'
[root@node1 ljy]# echo "bt" | gawk '/ba+t/{print $0}'
[root@node1 ljy]# echo "bat" | gawk '/ba+t/{print $0}'
bat
[root@node1 ljy]# echo "baat" | gawk '/ba+t/{print $0}'
baat
Curly braces
Curly braces in ERE let you specify lower and upper limits on how many times the preceding expression repeats.
{m,n}: the expression appears at least m times and at most n times.
[root@node1 ljy]# echo "baat" | gawk '/b[ae]{1,2}t/{print $0}'
baat
[root@node1 ljy]# echo "baaat" | gawk '/b[ae]{1,2}t/{print $0}'
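Intervals also shorten patterns that repeat the same group several times. The sketch below uses grep -E, which is not used elsewhere in this article but accepts ERE interval syntax, on made-up numbers. A single number in the braces, as in {5}, means exactly that many repetitions.

```shell
# Hypothetical input: numbers of various lengths.
input=$(printf '1231\n12345\n34211\n444444\n')

# {5} means exactly five repetitions of the preceding [0-9].
five=$(printf '%s\n' "$input" | grep -E '^[0-9]{5}$')
echo "$five"

# {4,5} accepts four or five digits.
four_or_five=$(printf '%s\n' "$input" | grep -E '^[0-9]{4,5}$')
echo "$four_or_five"
```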
Pipe symbol
The pipe symbol (|) combines two or more patterns in a logical OR fashion: a line matches if any one of the patterns matches.
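A minimal sketch of the pipe symbol, using made-up input. Plain awk is used here for portability; gawk accepts the same pattern.

```shell
# Hypothetical input lines.
input=$(printf 'this is a cat\nthis is a dog\nthis is a bird\n')

# cat|dog matches a line containing either "cat" or "dog".
either=$(printf '%s\n' "$input" | awk '/cat|dog/{print $0}')
echo "$either"
```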
Expression grouping
Parts of a regular expression can be grouped with parentheses. The group is then treated as a single unit, for example to limit the scope of a pipe symbol.
[root@node1 ljy]# echo "bat" | gawk '/b(a|e)t/{print $0}'
bat
[root@node1 ljy]# echo "baat" | gawk '/b(a|e)t/{print $0}'
[root@node1 ljy]# echo "bet" | gawk '/b(a|e)t/{print $0}'
bet
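A group can also take a repetition character, which then applies to the whole group rather than to a single character. The following sketch, with hypothetical input and plain awk, makes the suffix "urday" optional as a unit.

```shell
# Hypothetical input lines.
input=$(printf 'Sat\nSaturday\nSatur\n')

# (urday)? makes the whole group optional, so "Sat" and "Saturday"
# match, while the partial "Satur" does not.
days=$(printf '%s\n' "$input" | awk '/^Sat(urday)?$/{print $0}')
echo "$days"
```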
Summary
That is all for this article. I hope it serves as a useful reference for your study or work.