More regular expression study notes

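1. \b: Matches a word boundary, that is, the position where a word begins or ends. The neighbouring character may be a space, a punctuation mark, or a line break, but \b does not match any of those characters; it matches only the position itself.
Example: \bhi\b: Finds every standalone word "hi" in the text, but not words that merely contain it, such as him or history.
1.1 ^: Matches the beginning of the string (in multi-line mode, the beginning of each line).
1.2 $: Matches the end of the string (in multi-line mode, the end of each line). Like \b, both ^ and $ match a position rather than a character, but they are not special cases of \b: for example, ^ still matches at the start of a string that begins with a space, where there is no word boundary.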
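To see the difference between a bare pattern and one wrapped in \b, here is a small sketch using Python's re module. The sample sentence and variable names are made up for illustration.

```python
import re

text = "hi, him and history say hi"

# \bhi\b matches only the standalone word "hi".
standalone = re.findall(r"\bhi\b", text)

# Without \b, "hi" is also found inside "him" and "history".
anywhere = re.findall(r"hi", text)

# ^ and $ anchor the match to the start and end of the string.
starts_with_hi = re.search(r"^hi", text) is not None
ends_with_hi = re.search(r"hi$", text) is not None

print(standalone)      # ['hi', 'hi']
print(len(anywhere))   # 4
print(starts_with_hi, ends_with_hi)  # True True
```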
Repetition:
2. *: The item before * may repeat any number of times, including zero. Combined as ".*", it matches any number of characters, none of which is a line break.
Example: \bhi\b.*\bLucy\b: The word hi, then any number of characters (but no line break), and finally the separate word Lucy.
2.1 +: Also a repetition count, but + requires 1 or more repetitions, whereas * also allows 0.
2.2 {n}: The preceding item is repeated exactly n times.
2.3 {n,m}: The preceding item is repeated between n and m times, with n<=m.
2.4 ?: The preceding item is repeated 0 or 1 times, i.e. it is optional.
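The quantifiers above can be compared side by side. This is a minimal sketch in Python; the test strings (and the colou?r pattern) are illustrative choices, not part of the original notes.

```python
import re

samples = ("a", "ab", "abbb")

# * allows zero or more repetitions, + requires at least one.
star = [bool(re.fullmatch(r"ab*", s)) for s in samples]
plus = [bool(re.fullmatch(r"ab+", s)) for s in samples]

# {n} is an exact count, {n,m} a range, ? makes the item optional.
exact = bool(re.fullmatch(r"\d{3}", "123"))
ranged = [bool(re.fullmatch(r"\d{2,4}", s)) for s in ("1", "12", "1234", "12345")]
optional = [bool(re.fullmatch(r"colou?r", s)) for s in ("color", "colour")]

# By default .* does not cross a line break.
same_line = re.search(r"\bhi\b.*\bLucy\b", "hi there, Lucy")
split_line = re.search(r"\bhi\b.*\bLucy\b", "hi there,\nLucy")

print(star)      # [True, True, True]
print(plus)      # [False, True, True]
print(ranged)    # [False, True, True, False]
print(same_line is not None, split_line is not None)  # True False
```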
3. .: Matches any single character except a line break.
4. \d: Matches any single digit (0, 1, 2, ..., 9).
Example: 0\d\d-\d{7}: Finds a string that starts with 0, followed by two more digits, a hyphen "-", and then 7 digits, such as 025-8224110.
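A quick check of the phone-number pattern and of the dot, sketched in Python. The sample text is invented; only 025-8224110 comes from the notes.

```python
import re

phone = re.compile(r"0\d\d-\d{7}")

# Only the first number starts with 0 and has 7 digits after the hyphen.
text = "Call 025-8224110 or 125-8224110, not 025-822411."
found = phone.findall(text)

# The dot matches any character except a line break,
# so "h\nt" is skipped.
dot = re.findall(r"h.t", "hat hot h\nt")

print(found)  # ['025-8224110']
print(dot)    # ['hat', 'hot']
```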
5. \s: Matches any whitespace character, including spaces, tabs, line breaks, and (in Unicode-aware engines) the Chinese full-width space.
6. \w: Matches a word character: a letter, a digit, or an underscore (Unicode-aware engines also include characters such as Chinese characters).
Example 1: \ba\w*\b: Matches the letter "a" at the start of a word, then any number of word characters (no whitespace or punctuation), then the end of the word. In other words, every word that starts with a.
Example 2: \b\w{6}\b: Matches words that are exactly 6 characters long.
7. []: Matches any single character listed inside the square brackets.
Example: [abc]\w{4}\b: Matches one of a, b, or c followed by exactly 4 word characters and then the end of a word, 5 characters in total. Because there is no leading \b, the match may begin in the middle of a longer word.
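The three examples above can be tried out as follows. This is a sketch with made-up sample text; note how the bracket pattern, which has no leading \b, picks up the tail of banana.

```python
import re

text = "an apple and a banana\tarrive, almost ripe"

# \ba\w*\b: every word that starts with "a".
a_words = re.findall(r"\ba\w*\b", text)

# \b\w{6}\b: words exactly six characters long.
six = re.findall(r"\b\w{6}\b", text)

# \s+ matches any run of whitespace (space, tab, line break).
tokens = re.split(r"\s+", "one two\tthree\nfour")

# [abc]\w{4}\b has no leading \b, so it can start inside a word.
bracket = re.findall(r"[abc]\w{4}\b", "apple crumb banana")

print(a_words)  # ['an', 'apple', 'and', 'a', 'arrive', 'almost']
print(six)      # ['banana', 'arrive', 'almost']
print(tokens)   # ['one', 'two', 'three', 'four']
print(bracket)  # ['apple', 'crumb', 'anana']
```

Adding \b in front of the bracket, as in \b[abc]\w{4}\b, would restrict the matches to whole 5-character words.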
Negation
8. \D \S \W \B: The uppercase form of each of these metacharacters matches the opposite of what its lowercase form matches.
Example: \D: Any character that is not a digit, such as each letter in abced.
8.1 [^x]: Any character other than x.
8.2 [^xyz]: Any character that is not x, y, or z.
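A short Python sketch of the negated forms; all sample strings are illustrative.

```python
import re

# \D+ collects runs of non-digits, \S+ runs of non-whitespace.
non_digits = re.findall(r"\D+", "abc123def45")
non_space = re.findall(r"\S+", "a b\tc")

# \W matches one character that is not a letter, digit, or underscore.
non_word = re.findall(r"\W", "a-b c!")

# \B matches where there is NO word boundary, so \Bhi\B only finds
# "hi" in the middle of a word (chip, shin), not the standalone "hi".
inner = re.findall(r"\Bhi\B", "hi chip shin")

# [^xyz] matches any character except x, y, or z; removing those
# matches leaves only x, y, and z behind.
kept = re.sub(r"[^xyz]", "", "xaybzc")

print(non_digits)  # ['abc', 'def']
print(non_space)   # ['a', 'b', 'c']
print(non_word)    # ['-', ' ', '!']
print(inner)       # ['hi', 'hi']
print(kept)        # xyz
```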
9. Alternation
"|": The "|" symbol implements a logical OR between alternatives. Combined with parentheses "()", it limits the OR to part of the expression, so that different conditions can be combined.
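10. Grouping
"()": Enclosing a subexpression in parentheses turns it into a group, so that repetition, alternation, and other operations can be applied to it as a whole.
Example: \b(\w+\b\s+)\1+\b: \1 refers back to the text captured by the first parenthesised group, so the pattern finds a word that is immediately repeated. In go go go it matches "go go " (each repetition must be followed by whitespace, so the final go is left out).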
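Alternation and grouping are easiest to understand together. The sketch below, in Python, first shows how parentheses change the scope of "|", then runs the backreference example. The cat/dog patterns and the variant with the whitespace outside the group are my own illustrations, not from the notes.

```python
import re

# Without parentheses, | splits the whole pattern: "^cat" OR "dog$".
loose = bool(re.search(r"^cat|dog$", "catalog"))

# Parentheses limit the OR to the group, so both anchors apply to each word.
strict = bool(re.search(r"^(cat|dog)$", "catalog"))
exact = [bool(re.search(r"^(cat|dog)$", s)) for s in ("cat", "dog")]

# The pattern from the notes: \1 repeats whatever group 1 captured.
m = re.search(r"\b(\w+\b\s+)\1+\b", "go go go")
whole, group1 = m.group(0), m.group(1)

# Illustrative variant: keeping the whitespace outside the group
# captures just the repeated word.
doubled = re.findall(r"\b(\w+)\b\s+\1\b", "this is is a test test")

print(loose, strict, exact)       # True False [True, True]
print(repr(whole), repr(group1))  # 'go go ' 'go '
print(doubled)                    # ['is', 'test']
```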
Getting this far through self-study is already a good result. Let's continue with the more advanced features of regular expressions.
Assertions:
(?=express) Positive lookahead, placed after the part of the pattern to be matched. It checks that the text immediately following the current position matches express, but that text is not included in the match.
Example: \b\w*(?=ing\b): Gets the part before the ing of every word that ends in ing.
(?<=express) Positive lookbehind, placed at the head of the pattern. It checks that the text immediately before the current position matches express, and that text is likewise not included in the match.
Example: (?<=\bre)\w*\b: Gets the part after the re of every word that begins with re.
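Both assertion examples can be run as written in Python. The sample sentence is made up; note that Python's re module only accepts fixed-width lookbehind, which \bre satisfies.

```python
import re

text = "singing and reading while rewriting"

# Lookahead: match the word stem only if "ing" plus a word boundary follows.
# The "ing" itself is checked but not included in the result.
stems = re.findall(r"\b\w*(?=ing\b)", text)

# Lookbehind: match the rest of the word only if "re" at a word start
# comes right before it. The "re" itself is not included.
rests = re.findall(r"(?<=\bre)\w*\b", text)

print(stems)  # ['sing', 'read', 'rewrit']
print(rests)  # ['ading', 'writing']
```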
Comments:
(?#comment) Text written in this form is a comment inside the regular expression and is ignored when matching.
Example: 2[0-4]\d(?#200-249)
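The inline comment does not change what the pattern matches. As a sketch in Python: the first pattern is the one from the notes, and the second uses Python's re.VERBOSE flag, which is a Python-specific alternative that the notes do not mention.

```python
import re

# Inline comment form from the notes.
inline = re.compile(r"2[0-4]\d(?#200-249)")

# Python alternative: with re.VERBOSE, whitespace in the pattern is
# ignored and "#" starts a comment that runs to the end of the line.
verbose = re.compile(r"""
    2[0-4]   # first two digits: 20 to 24
    \d       # any final digit
""", re.VERBOSE)

samples = ("200", "249", "250", "199")
inline_results = [bool(inline.fullmatch(s)) for s in samples]
verbose_results = [bool(verbose.fullmatch(s)) for s in samples]

print(inline_results)   # [True, True, False, False]
print(verbose_results)  # [True, True, False, False]
```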
Lazy pattern matching
*: Greedy; matches as many characters as possible.
*?: Lazy; matches as few characters as possible.
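The difference shows up as soon as the text contains more than one possible ending. A minimal Python sketch, with an illustrative input string:

```python
import re

text = "aabab"

# Greedy: .* grabs as much as it can, so the match runs to the last "b".
greedy = re.findall(r"a.*b", text)

# Lazy: .*? stops at the first "b" it can, giving two shorter matches.
lazy = re.findall(r"a.*?b", text)

print(greedy)  # ['aabab']
print(lazy)    # ['aab', 'ab']
```

The same "?" suffix can be sketched onto the other quantifiers as well (+?, ??, {n,m}?), each then preferring the fewest repetitions that still let the overall match succeed.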