Updated on 2025-03-03

Several basic concepts of regular expressions

I often see articles about regular expressions that cover only the methods and rarely mention the following basic concepts:

1. Greedy matching: +, *, ?, {m,n}, etc. are greedy by default, that is, they match as much as possible; this is also called maximal matching.
Appending ? to the quantifier (for example +? or *?) turns it into a non-greedy match, which matches as little as possible. This requires higher version support.
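
As a minimal sketch of the difference, the snippet below runs a greedy and a non-greedy pattern over the same input. The sample string and the variable names are made up for illustration.

```javascript
// Hypothetical sample input with two tagged segments.
const tagged = "<b>one</b><b>two</b>";

// Greedy: .* takes as much as it can, so the match runs to the LAST </b>.
const greedyMatch = tagged.match(/<b>.*<\/b>/)[0];

// Non-greedy: .*? takes as little as it can, so the match ends at the FIRST </b>.
const lazyMatch = tagged.match(/<b>.*?<\/b>/)[0];

console.log(greedyMatch); // "<b>one</b><b>two</b>"
console.log(lazyMatch);   // "<b>one</b>"
```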

2. Capturing: By default, (x|y) captures what it matches. In many cases the group is only a test and the matched text does not need to be kept. Especially in nested groups or on large inputs, the non-capturing form (?:x|y) should be used instead; it does not store the match, which improves efficiency.
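
A small sketch of what "capturing" means in practice, assuming a made-up file name and extension list: both patterns accept the same input, but only the first stores the group.

```javascript
// Hypothetical input used only for illustration.
const fileName = "photo.jpeg";

// Capturing group: the alternative that matched is stored at index 1.
const capturing = /\.(jpe?g|png)$/.exec(fileName);

// Non-capturing group: the same test, but nothing is stored for the group.
const nonCapturing = /\.(?:jpe?g|png)$/.exec(fileName);

console.log(capturing.length);    // 2 (whole match plus one group)
console.log(capturing[1]);        // "jpeg"
console.log(nonCapturing.length); // 1 (whole match only)
```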

3. Consumption: Matching consumes characters by default; a pre-check (lookahead) is generally non-consuming.
For example, suppose 2003-2-8 should become 2003-02-08.
With /-(\d)-/, the first match consumes the second -, so the second match starts from 8. Only the first single digit, 2, gets replaced, which is wrong.
With /-(\d)(?=-)/, the second match starts from the second -, because the pre-check consumes no characters.
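
The date example can be sketched as follows. One assumption is added here: both patterns get an extra `|$` alternative so that the final "-8" at the end of the string can be matched at all. The variable names are illustrative.

```javascript
const rawDate = "2003-2-8";

// Consuming version: "-2-" is matched and the second "-" is consumed,
// so the next search starts at "8" and the day is never padded.
const consumed = rawDate.replace(/-(\d)(-|$)/g, "-0$1$2");

// Lookahead version: (?=-|$) only checks what follows without consuming it,
// so the next search starts at the second "-" and "-8" is matched too.
const lookedAhead = rawDate.replace(/-(\d)(?=-|$)/g, "-0$1");

console.log(consumed);    // "2003-02-8"
console.log(lookedAhead); // "2003-02-08"
```

Two-digit fields are left alone by the lookahead version, since a single digit must be followed directly by - or the end of the string.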

4. Pre-check (lookahead): In JavaScript there are positive and negative pre-checks.
As above, (?=pattern) is a positive pre-check: it matches at any position where the text that follows matches pattern. (?!pattern) is a negative pre-check: it matches at any position where the text that follows does not match pattern. Negative pre-checks are sometimes used to extend [^...]: a negated class can only exclude individual characters, whereas (?!...) can exclude an entire string.
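
A brief sketch contrasting a negated class with a negative pre-check, plus one positive pre-check. The patterns and test strings are invented for illustration.

```javascript
// Positive pre-check: match digits only when "px" follows; "px" is not part of the match.
const pixels = "12px".match(/\d+(?=px)/)[0];

// [^abc] rejects the single characters a, b and c wherever they occur.
const noSuchChars = /^[^abc]*$/;

// (?!abc) rejects only the whole sequence "abc": each character is accepted
// as long as "abc" does not start at that position.
const noSuchString = /^(?:(?!abc).)*$/;

console.log(pixels);                      // "12"
console.log(noSuchChars.test("cab"));     // false: contains a, b and c
console.log(noSuchString.test("cab"));    // true: no "abc" sequence
console.log(noSuchString.test("xxabcx")); // false: contains "abc"
```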

5. Callback: Generally used in replacement: a function returns different replacement values depending on the matched content, which simplifies the program. This requires higher version support.
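
A minimal sketch of a replacement callback, assuming a made-up temperature string: the function is called once per match and its return value becomes the replacement.

```javascript
// Hypothetical input: temperatures in Celsius to be rewritten in Fahrenheit.
const temps = "low 10C, high 25C";

// The callback receives the whole match first, then each captured group.
const converted = temps.replace(/(\d+)C/g, function (match, celsius) {
  const fahrenheit = Number(celsius) * 9 / 5 + 32;
  return fahrenheit + "F";
});

console.log(converted); // "low 50F, high 77F"
```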

6. Backreference: \num refers to the text captured by the num-th group.
For example, '(.)\1\1' matches the AAA type, and '(.)(.)\2\1' matches the ABBA type.
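
The two patterns can be tried out as in this sketch (the test strings are illustrative):

```javascript
const tripled = /(.)\1\1/;        // AAA type: one character repeated three times
const mirrored = /(.)(.)\2\1/;    // ABBA type: two characters, then the same two reversed

console.log(tripled.test("xAAAy"));  // true
console.log(tripled.test("AAB"));    // false
console.log(mirrored.test("ABBA"));  // true
console.log(mirrored.test("ABAB"));  // false
console.log(mirrored.test("AAAA"));  // true: nothing forces the two groups to differ
```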


Of course there is much more; these are only the basics that need to be mastered.

When you run into a regular expression problem, the following two methods will generally solve it:

1. Classification: list every possible case and write a pattern for each, for example the integers from 0 to 2003:
0            0
1-999        [1-9]\d{0,2}
1000-1999    1\d{3}
2000-2003    200[0-3]

So the final pattern is (0|[1-9]\d{0,2}|1\d{3}|200[0-3])
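
As a quick check of the classification, the sketch below tests every integer from 0 to 3000 against the pattern. The ^ and $ anchors are an addition made here so that the whole string must match; the variable names are illustrative.

```javascript
// The pattern from the text, anchored so the entire string has to match.
const upTo2003 = /^(0|[1-9]\d{0,2}|1\d{3}|200[0-3])$/;

// Collect every integer in 0..3000 that the pattern accepts.
const accepted = [];
for (let n = 0; n <= 3000; n++) {
  if (upTo2003.test(String(n))) accepted.push(n);
}

console.log(accepted.length);               // 2004 (0 through 2003)
console.log(accepted[accepted.length - 1]); // 2003
console.log(upTo2003.test("007"));          // false: leading zeros are not a listed case
```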

2. Grouping: split the whole string into its smallest units, for example when ', % and _ must appear in pairs.
The units that are allowed to appear are:
''
%%
__
[^'%_]    any character other than the above
A longer run of even length, such as %%%%, simply breaks down into several of these 2-character units.

So the final pattern is ^(''|%%|__|[^'%_])*$
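
The grouping pattern can be exercised as in this sketch; the sample strings are invented for illustration.

```javascript
// Every ', % or _ must come as a doubled unit; any other character stands alone.
const doubledOnly = /^(''|%%|__|[^'%_])*$/;

console.log(doubledOnly.test("it''s 100%%")); // true: '' and %% are both doubled
console.log(doubledOnly.test("it's"));        // false: a single ' is not an allowed unit
console.log(doubledOnly.test("a____b"));      // true: ____ splits into two __ units
console.log(doubledOnly.test("a___b"));       // false: an odd run leaves one _ over
console.log(doubledOnly.test(""));            // true: zero units is allowed by *
```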


--------------------------------------------------------------------------------

Passing a function to replace requires 5.5+ support.
