(8) Greedy and non-greedy
The regular expression engine is greedy by default: a quantifier matches as many characters as the pattern allows. Adding "?" after a quantifier such as "*" or "+" makes it non-greedy, so it matches as few characters as possible. See the following example:
string x = "Live for nothing,die for something";
Regex r1 = new Regex(@".*thing");
if (r1.IsMatch(x))
{
    Console.WriteLine("match:" + r1.Match(x).Value); // Output: Live for nothing,die for something
}
Regex r2 = new Regex(@".*?thing");
if (r2.IsMatch(x))
{
    Console.WriteLine("match:" + r2.Match(x).Value); // Output: Live for nothing
}
(9) Backtracking and non-backtracking
Use the "(?>…)" construct to declare a non-backtracking group. Because quantifiers are greedy, the engine sometimes has to backtrack to obtain a match; see the following example:
string x = "Live for nothing,die for something";
Regex r1 = new Regex(@".*thing,");
if (r1.IsMatch(x))
{
    Console.WriteLine("match:" + r1.Match(x).Value); // Output: Live for nothing,
}
Regex r2 = new Regex(@"(?>.*)thing,");
if (r2.IsMatch(x)) // Does not match
{
    Console.WriteLine("match:" + r2.Match(x).Value);
}
// In r1, ".*" is greedy and first matches all the way to the end of the string. The engine gives characters back until "thing" matches (in "something"), but the following "," fails there, so it keeps backtracking and finally succeeds at "thing," in "nothing,".
// In r2, "(?>.*)" also consumes the rest of the string but is not allowed to backtrack, so the whole expression fails to match.
(10) Lookahead and lookbehind
Lookahead assertion format: positive assertion "(?=…)", negative assertion "(?!…)". The assertion itself is not part of the final match result; see the following example:
string x = "1024 used 2048 free";
Regex r1 = new Regex(@"\d{4}(?= used)");
if (r1.Matches(x).Count == 1)
{
    Console.WriteLine("r1 match:" + r1.Match(x).Value); // Output: 1024
}
Regex r2 = new Regex(@"\d{4}(?! used)");
if (r2.Matches(x).Count == 1)
{
    Console.WriteLine("r2 match:" + r2.Match(x).Value); // Output: 2048
}
// The positive assertion in r1 requires the four-digit number to be followed immediately by " used"; the negative assertion in r2 requires that the four-digit number is not followed immediately by " used".
Lookbehind assertion format: positive assertion "(?<=…)", negative assertion "(?<!…)". The assertion itself is not part of the final match result; see the following example:
string x = "used:1024 free:2048";
Regex r1 = new Regex(@"(?<=used:)\d{4}");
if (r1.Matches(x).Count == 1)
{
    Console.WriteLine("r1 match:" + r1.Match(x).Value); // Output: 1024
}
Regex r2 = new Regex(@"(?<!used:)\d{4}");
if (r2.Matches(x).Count == 1)
{
    Console.WriteLine("r2 match:" + r2.Match(x).Value); // Output: 2048
}
// The positive lookbehind in r1 requires "used:" to appear immediately before the four-digit number; the negative lookbehind in r2 requires that the four-digit number is not immediately preceded by "used:".
(11) Hexadecimal character range
In regular expressions, "\xXX" and "\uXXXX" each represent a single character by its code, where every "X" is a hexadecimal digit. Their ranges are:
\xXX Characters with codes from 0 to 255; for example, a space can be written as "\x20".
\uXXXX Any character can be written as "\u" followed by its four-digit hexadecimal code; for example, Chinese characters can be matched with "[\u4e00-\u9fa5]".
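As a minimal sketch of the two escapes, the following program checks for a space written as "\x20" and for Chinese characters with "[\u4e00-\u9fa5]". The class name HexEscapeDemo and the helpers ContainsSpace and ContainsChinese are made up for illustration; only Regex.IsMatch comes from the .NET library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexEscapeDemo
{
    // Hypothetical helper: true if the text contains a space, written as \x20.
    public static bool ContainsSpace(string text)
    {
        return Regex.IsMatch(text, @"\x20");
    }

    // Hypothetical helper: true if the text contains at least one character
    // in the range U+4E00..U+9FA5.
    public static bool ContainsChinese(string text)
    {
        return Regex.IsMatch(text, @"[\u4e00-\u9fa5]");
    }

    public static void Main()
    {
        Console.WriteLine(ContainsSpace("a b"));          // True
        Console.WriteLine(ContainsSpace("ab"));           // False
        Console.WriteLine(ContainsChinese("regex 正则"));  // True
        Console.WriteLine(ContainsChinese("regex"));      // False
    }
}
```

Because the patterns are verbatim strings (@"…"), the backslash sequences reach the regex engine unchanged and are interpreted there rather than by the C# compiler.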
(12) A relatively complete match for [0,100]
The following is a relatively comprehensive example. To match a number in the range [0,100], these points need special consideration:
*"00", "00.", "00.00" and "001.100" are all legal
*An empty string is illegal, a lone decimal point is illegal, and any value greater than 100 is illegal
*A type suffix on the value, such as "1.07f" to mark a float, is not considered
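Regex r = new Regex(@"^\+?0*(?:100(\.0*)?|(\d{0,2}(?=\.\d)|\d{1,2}(?=($|\.$)))(\.\d*)?)$");
string x = "";
while (true)
{
    x = Console.ReadLine();
    // Stop on "exit" or when the input stream ends (ReadLine returns null).
    if (x != null && x != "exit")
    {
        if (r.IsMatch(x))
        {
            Console.WriteLine(x + " succeed!");
        }
        else
        {
            Console.WriteLine(x + " failed!");
        }
    }
    else
    {
        break;
    }
}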
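To try the pattern without typing inputs by hand, the cases listed above can be run as a small table. This is only a sketch: the class name RangeCheckDemo, the helper InRange and the extra sample values (such as "99.99" and "100.1") are chosen here for illustration; the pattern itself is copied unchanged from the example above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RangeCheckDemo
{
    // Same pattern as in the interactive example above.
    private static readonly Regex Pattern = new Regex(
        @"^\+?0*(?:100(\.0*)?|(\d{0,2}(?=\.\d)|\d{1,2}(?=($|\.$)))(\.\d*)?)$");

    // Hypothetical helper: true if s is accepted as a number in [0,100].
    public static bool InRange(string s)
    {
        return Pattern.IsMatch(s);
    }

    public static void Main()
    {
        // Inputs the pattern is expected to accept.
        string[] legal = { "00", "00.", "00.00", "001.100", "0", "99.99", "100", "100.0" };
        // Inputs the pattern is expected to reject.
        string[] illegal = { "", ".", "101", "100.1", "1.07f" };

        foreach (string s in legal)
        {
            Console.WriteLine("\"" + s + "\" -> " + InRange(s));
        }
        foreach (string s in illegal)
        {
            Console.WriteLine("\"" + s + "\" -> " + InRange(s));
        }
    }
}
```

Every entry in legal should print True and every entry in illegal should print False. Note that "100.1" is rejected only because the first alternative allows nothing but zeros after "100".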
(13) Exact matching is sometimes difficult
Some requirements are hard to match exactly, such as dates, URLs and email addresses. For some of them you even need to study the relevant specifications before you can write an accurate and complete expression. In such cases you can only settle for second best and aim for a reasonably accurate match. For dates, for example, you can restrict the pattern to the limited time span the application actually needs; for email addresses, you can handle only the most common forms.
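As one possible illustration of settling for second best, the sketch below accepts dates written as yyyy-MM-dd and limits the year to 2000-2099. The date format, the year range, the class name LooseDateDemo and the helper LooksLikeDate are all assumptions made for this example, not requirements from the text. The pattern deliberately does not know how many days each month has.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LooseDateDemo
{
    // Assumed format: yyyy-MM-dd, years 2000-2099, months 01-12, days 01-31.
    // Month lengths and leap years are NOT checked.
    private static readonly Regex DatePattern = new Regex(
        @"^20\d{2}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$");

    // Hypothetical helper: true if s has the shape of a date in the assumed range.
    public static bool LooksLikeDate(string s)
    {
        return DatePattern.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeDate("2024-02-29")); // True
        Console.WriteLine(LooksLikeDate("2024-13-01")); // False: month 13
        Console.WriteLine(LooksLikeDate("1999-12-31")); // False: year outside 2000-2099
        Console.WriteLine(LooksLikeDate("2024-02-31")); // True: shape is fine, date is not real
    }
}
```

The last case shows the trade-off: the expression stays short and readable, but it accepts a few impossible dates. If that matters for the application, a calendar-aware check such as DateTime.TryParseExact can be applied after the regex instead of making the pattern more complicated.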