SoFunction
Updated on 2025-03-01

Common regular expression knowledge points explained, with regular expressions for valid numbers, mobile phone numbers, and email addresses

1. Regular expressions are used to process strings: matching and capturing

Match: verify that the current string complies with our rule (every regular expression is a rule)
Capture: from the whole string, obtain in sequence all the characters that satisfy the rule -> exec, match, replace
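
As a minimal sketch of the difference between matching and capturing (the variable names are illustrative only, and test is the standard RegExp method used here for the matching step):

```javascript
const reg = /\d+/;            // the rule: one or more digits
const str = "zhufeng2015";

// Matching: does the string comply with the rule?
const isMatch = reg.test(str);           // true

// Capturing: obtain the characters that satisfy the rule
const captured = reg.exec(str);          // array whose first item is "2015"
const viaMatch = str.match(reg);         // same first item for a non-global regular expression
const replaced = str.replace(reg, "#");  // "zhufeng#"

console.log(isMatch, captured[0], viaMatch[0], replaced);
```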

2. The composition of a regular expression: metacharacters and modifiers

Metacharacters:

Metacharacters with special meaning:

\d matches one digit from 0 to 9, equivalent to [0-9]; its opposite is \D
\D matches any character except 0-9, equivalent to [^0-9]
\w matches one digit, letter, or underscore, equivalent to [0-9a-zA-Z_]
\s matches one whitespace character (space, tab, ...)
\b matches a word boundary; in "w100 w000" there is a boundary at the start and at the end of each word
\t matches a tab character
\n matches a line break
. matches any character except \n
^ the string must start with the metacharacter that follows
$ the string must end with the metacharacter that precedes it
\ escape character: changes the meaning of the character that follows
x|y matches either x or y
[xyz] matches any one of x, y, z
[^xyz] matches any one character except x, y, z
[a-z] matches any one character in the range a-z
[^a-z] matches any one character outside the range a-z
() a group within the regular expression
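
A few of the metacharacters above can be checked directly. This is only a sketch; the object name `samples` and the test strings are chosen for illustration.

```javascript
const samples = {
  digit: /^\d$/.test("7"),                     // true: "7" is one digit
  nonDigit: /^\D$/.test("7"),                  // false: \D excludes 0-9
  wordChars: /^\w+$/.test("a_9"),              // true: letter, underscore, digit
  boundary: "w100 w000".match(/\bw\d+\b/g),    // ["w100", "w000"]
  dotVsNewline: /^.$/.test("\n"),              // false: . does not match \n
  either: /^(x|y)$/.test("y"),                 // true
  negatedClass: /^[^xyz]$/.test("a")           // true: "a" is not x, y, or z
};

console.log(samples);
```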

Quantifiers:

* zero or more times
+ one or more times
? zero or one time

The ? has several meanings in a regular expression:

Placed after a non-quantifier metacharacter, it means zero or one occurrence. For example, /^\d?$/ requires a digit 0-9 to appear zero or one time.

Placed after a quantifier metacharacter, it cancels greediness during capture. For example, /\d+?/ captures only the first digit: "2015" -> "2". (With the anchors, /^\d+?$/ would still have to consume the whole string in order to match.)
(?:) the group only matches, it does not capture
(?=) positive lookahead
(?!) negative lookahead
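
The different meanings of ? can be compared side by side. The following is a small sketch with made-up variable names, not a complete treatment of lookahead.

```javascript
// ? after a non-quantifier metacharacter: zero or one occurrence
const optional = /^\d?$/;
const optionalResults = [optional.test(""), optional.test("5"), optional.test("55")];
// [true, true, false]

// ? after a quantifier: cancel greediness
const greedy = /\d+/.exec("2015")[0];   // "2015"
const lazy = /\d+?/.exec("2015")[0];    // "2"

// (?:) matches but does not capture, so only one group is returned
const nonCapturing = /(?:\d)(\d+)/.exec("2015"); // ["2015", "015"]

// (?=) positive lookahead: "zhufeng" only when a digit follows (the digit is not consumed)
const positive = /zhufeng(?=\d)/.exec("zhufeng2015")[0]; // "zhufeng"

// (?!) negative lookahead: "zhufeng" only when no digit follows
const negative = /zhufeng(?!\d)/.test("zhufeng2015");    // false

console.log(optionalResults, greedy, lazy, nonCapturing, positive, negative);
```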

The roles of ():
1) Change the default precedence
2) Enable group capture
3) Enable group references

More quantifiers:
{n} exactly n times
{n,} n or more times
{n,m} between n and m times

Ordinary metacharacters

Apart from the metacharacters with special meaning listed above, every other character in a regular expression is an ordinary metacharacter that stands for itself.

Modifiers:

i: ignoreCase, ignore the case of letters
m: multiline, multi-line matching
g: global, global matching
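
The effect of the three modifiers can be seen on a two-line string. This is a sketch under the assumption that the input is "Zhufeng\nzhufeng"; the variable names are illustrative.

```javascript
const text = "Zhufeng\nzhufeng";

const caseSensitive = text.match(/zhufeng/g);   // ["zhufeng"]: only the lowercase one
const ignoreCase = text.match(/zhufeng/gi);     // ["Zhufeng", "zhufeng"]
const singleLine = text.match(/^zhufeng$/gi);   // null: ^ and $ refer to the whole string
const multiLine = text.match(/^zhufeng$/gim);   // ["Zhufeng", "zhufeng"]: ^ and $ apply per line

console.log(caseSensitive, ignoreCase, singleLine, multiLine);
```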

Regular expressions often used in projects

1) Checking for a valid number

A valid number is a positive number, a negative number, zero, or a decimal.

Part 1: an optional plus or minus sign
Part 2: a single digit may be 0; a multi-digit number cannot start with 0
Part 3: the decimal part is optional, but once a decimal point appears it must be followed by at least one digit

    var reg = /^[+-]?(\d|[1-9]\d+)(\.\d+)?$/;

Valid non-negative integer (positive or 0): /^[+]?(\d|[1-9]\d+)$/;

Valid negative integer (-0 is also accepted): /^-(\d|[1-9]\d+)$/;
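
The three number rules above can be exercised with a few sample strings. This is a minimal sketch: the helper `check` and the sample inputs are made up for illustration.

```javascript
const numberReg = /^[+-]?(\d|[1-9]\d+)(\.\d+)?$/;
const nonNegativeIntReg = /^[+]?(\d|[1-9]\d+)$/;
const negativeIntReg = /^-(\d|[1-9]\d+)$/;

// Hypothetical helper: test every sample against one regular expression
function check(reg, inputs) {
  return inputs.map(function (s) { return reg.test(s); });
}

const numberResults = check(numberReg, ["0", "-12.5", "+3", "12.50", "012", "1.", ".5", "abc"]);
// [true, true, true, true, false, false, false, false]

const nonNegativeResults = check(nonNegativeIntReg, ["0", "+18", "-3", "007"]);
// [true, true, false, false]

const negativeResults = check(negativeIntReg, ["-3", "3", "-0"]);
// [true, false, true]

console.log(numberResults, nonNegativeResults, negativeResults);
```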

Checking a mobile phone number (simple version: 11 digits starting with 1):
    var reg = /^1\d{10}$/;

Checking an email address
Part 1: digits, letters, underscores, or -, one or more characters
Part 2: @
Part 3: digits, letters, or -, one or more characters
Part 4: a dot followed by two to four letters, such as .com, .cn, .net; this part may appear once or twice
    var reg = /^[0-9a-zA-Z_-]+@[0-9a-zA-Z-]+(\.[a-zA-Z]{2,4}){1,2}$/;
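
A quick check of the email rule shows both what it accepts and where it is stricter than real-world addresses. The addresses below are invented test data, and the result only describes this particular regular expression, not email validity in general.

```javascript
const emailReg = /^[0-9a-zA-Z_-]+@[0-9a-zA-Z-]+(\.[a-zA-Z]{2,4}){1,2}$/;

const emailResults = {
  "zhufeng_2015@example.com": emailReg.test("zhufeng_2015@example.com"), // true
  "user@example.com.cn": emailReg.test("user@example.com.cn"),           // true: Part 4 appears twice
  "user@example": emailReg.test("user@example"),                         // false: Part 4 is missing
  "first.last@example.com": emailReg.test("first.last@example.com"),     // false: Part 1 does not allow a dot
  "user@example.museum": emailReg.test("user@example.museum")            // false: suffix longer than four letters
};

console.log(emailResults);
```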

Checking that an age is between 18 and 65
Split the range into three parts: 18-19, 20-59, 60-65
    var reg = /^((18|19)|([2-5]\d)|(6[0-5]))$/;
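
One way to confirm that the three parts cover exactly 18 to 65 is to test every integer from 0 to 100. This is a sketch; `accepted` is an illustrative name.

```javascript
const ageReg = /^((18|19)|([2-5]\d)|(6[0-5]))$/;

// Collect every age from 0 to 100 that the regular expression accepts
const accepted = [];
for (let age = 0; age <= 100; age++) {
  if (ageReg.test(String(age))) {
    accepted.push(age);
  }
}

console.log(accepted[0], accepted[accepted.length - 1], accepted.length); // 18 65 48
```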

A real, valid name: 2 to 4 Chinese characters
    var reg = /^[\u4e00-\u9fa5]{2,4}$/;

ID number
The first six digits encode province -> city -> county (district)
They are followed by a four-digit year, a two-digit month, and a two-digit day

Simple version

    var reg = /^\d{17}(\d|X)$/;
    // sample: 130828199012040617

Complex version (groups capture the individual parts)

    var reg = /^(\d{2})(\d{4})(\d{4})(\d{2})(\d{2})(?:\d{2})(\d)(?:\d|X)$/;
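
With the complex version, exec returns the individual parts of the sample number as groups. The property names below are assumptions made for this sketch; in particular, the text does not say what the last captured digit means, so it is only labelled by position.

```javascript
const idReg = /^(\d{2})(\d{4})(\d{4})(\d{2})(\d{2})(?:\d{2})(\d)(?:\d|X)$/;
const parts = idReg.exec("130828199012040617");

// Hypothetical field names for the captured groups
const info = parts && {
  province: parts[1],          // "13"
  cityAndCounty: parts[2],     // "0828"
  year: parts[3],              // "1990"
  month: parts[4],             // "12"
  day: parts[5],               // "04"
  seventeenthDigit: parts[6]   // "1"
};

console.log(info);
```

The two (?:) groups are matched but skipped in the result, which is why the array has only six captured items after the full match.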

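Detailed knowledge points about square brackets

Inside square brackets, a character generally stands for itself. For example, the "." in [.] means a literal dot rather than any character other than \n.
Inside square brackets, 18 is not the number 18 but the two characters 1 and 8. For example, [18-65] is read as 1, the range 8-6, and 5, not as the numbers 18 to 65; because the range 8-6 runs backwards, JavaScript rejects this pattern with a syntax error.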
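
Both points about square brackets can be verified with a short sketch (the variable names are illustrative; the pattern is built with new RegExp so that the error can be caught at run time):

```javascript
// "." inside square brackets is a literal dot
const literalDot = /^[.]$/;
const dotResults = [literalDot.test("."), literalDot.test("a")]; // [true, false]

// Outside square brackets, "." matches any character except \n
const anyChar = /^.$/.test("a"); // true

// [18-65] is read as 1, the range 8-6, and 5; the backwards range is rejected
let rangeError = null;
try {
  new RegExp("[18-65]");
} catch (e) {
  rangeError = e;
}

console.log(dotResults, anyChar, rangeError instanceof SyntaxError);
```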

1. exec, the regular expression capture method -> match first, then capture the matched content

If the string does not match the rule, the capture returns null.

If the string matches the rule, the result is an array.

Example:

    var str = "2015zhufeng2016peixun";
    var reg = /\d+/;
    var res = reg.exec(str);
    console.log(res); //->["2015", index: 0, input: "2015zhufeng2016peixun"]

The first item is the content we captured.

index: the index in the original string where the captured content starts
input: the original string that was searched
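
The shape of the exec result, including the null case, can be inspected like this (a small sketch with illustrative names):

```javascript
const source = "2015zhufeng2016peixun";
const digits = /\d+/;

const found = digits.exec(source);
const summary = {
  captured: found[0],                    // "2015"
  index: found.index,                    // 0
  sameInput: found.input === source      // true
};

// A string without digits does not match, so exec returns null
const notFound = digits.exec("zhufeng"); // null

console.log(summary, notFound);
```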

2. Regular expression capture is lazy

Each capture starts searching from the position given by lastIndex. For the first capture, lastIndex is 0, so the search starts at index 0 of the original string. By default, lastIndex does not change after the first capture and is still 0, so the second capture also searches from index 0 and finds the content of the first capture again.
Solution: add the global modifier g. With g, after the first capture lastIndex changes to the index of the first character after the captured content, and the second capture continues searching from there, and so on.
Question: without the global modifier g, would it work to modify lastIndex manually after every capture?
No. Although the value of lastIndex really does change when it is modified manually, the search still starts from index 0.
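
The answer to the question above can be observed directly. This is a sketch with illustrative names; the value 11 is the index just after "2015" in this particular string.

```javascript
const text = "zhufeng2015peixun2016";

// Without g: lastIndex is ignored, even when set by hand
const lazyReg = /\d+/;
const first = lazyReg.exec(text);
lazyReg.lastIndex = 11;               // manual change
const second = lazyReg.exec(text);    // still finds "2015" at index 7

// With g: lastIndex moves forward after each capture
const globalReg = /\d+/g;
const g1 = globalReg.exec(text)[0];   // "2015"
const afterFirst = globalReg.lastIndex;  // 11
const g2 = globalReg.exec(text)[0];   // "2016"
const afterSecond = globalReg.lastIndex; // 21

console.log(first[0], second[0], second.index, g1, afterFirst, g2, afterSecond);
```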

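Example

To prevent the endless loop that would occur when the global modifier g is missing, the method below builds a global copy of any regular expression that lacks g before processing.

    var str = "zhufeng2015peixun2016";
    var reg = /\d+/g;

    RegExp.prototype.myExecAll = function myExecAll() {
        var _this = this, str = arguments[0], ary = [], res = null;
        // Without g the loop below would never end, so build a global copy
        if (!_this.global) {
            _this = new RegExp(_this.source, "g" + (_this.ignoreCase ? "i" : "") + (_this.multiline ? "m" : ""));
        }
        res = _this.exec(str);
        while (res) {
            ary[ary.length] = res[0];
            res = _this.exec(str);
        }
        return ary;
    };
    var ary = reg.myExecAll(str);
    console.log(ary); //->["2015", "2016"]
    console.log(reg.lastIndex); //->0
    var res = reg.exec(str);
    console.log(res);
    console.log(reg.lastIndex); //->11
    res = reg.exec(str);
    console.log(res);
    console.log(reg.lastIndex); //->21
    res = reg.exec(str);
    console.log(res); //->null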

3. match: strings have a method called match that can also capture. As long as the laziness of the regular expression is cancelled (by adding g), a single call to match captures all the matching content.

    var str = "zhufeng2015peixun2016";
    var reg = /\d+/g;
    console.log(str.match(reg)); //->["2015", "2016"]

Question: so would it be better to simply use match in place of exec?

4. Group capture

Each capture obtains not only the content matched by the whole regular expression but also, separately, the content matched by each group (sub-expression).

    var str = "zhufeng[2015]peixun[2016]";
    var reg = /\[(\d)(\d+)\]/g;
    var res = reg.exec(str);
    console.log(res);
    //->["[2015]", "2", "015", index: 7, input: "zhufeng[2015]peixun[2016]"]

The first item is the content captured by the whole regular expression, res[0]
The second item is the content captured by the first group, res[1]
The third item is the content captured by the second group, res[2]
...

Matching a group without capturing it: to have a group take part in matching without capturing its content, add ?: at the start of the group.

    var str = "zhufeng[2015]peixun[2016]";
    var reg = /\[(?:\d)(\d+)\]/g;
    var res = reg.exec(str);
    console.log(res);
    //->["[2015]", "015", ...]

The first item in the array is the content captured by the whole regular expression, res[0]
The second item in the array is the content captured by the second group, res[1]
The first group has ?: added, so it only matches and is not captured

5. The difference between exec and match

With the global modifier g, match returns only the content matched by the whole regular expression; the content matched by each group cannot be obtained. So if the group content is not needed, using match directly is more convenient. If the group content is needed, use exec and capture one match at a time.

    var str = "zhufeng[2015]peixun[2016]";
    var reg = /\[(\d+)\]/g;
    //console.log(str.match(reg)); //->["[2015]", "[2016]"]
    var ary = [];
    var res = reg.exec(str);
    while (res) {
        ary.push(res[1]);
        // RegExp.$1 also holds the content captured by the first group of the current regular expression
        // (a legacy static property; in some IE browsers it may hold no value)
        res = reg.exec(str);
    }
    console.log(ary); //->["2015", "2016"]

6. Regular expression greed: each capture takes the longest result that the regular expression can match. Adding ? after the quantifier cancels the greed.

    var str = "zhufeng2015peixun2016";
    var reg = /\d+/g;
    console.log(str.match(reg)); //->["2015", "2016"]

    var str = "zhufeng2015peixun2016";
    var reg = /\d+?/g;
    console.log(str.match(reg)); //->["2", "0", "1", "5", "2", "0", "1", "6"]

7. Group references

\2 means that content exactly the same as what the second group captured must appear here

\1 means that content exactly the same as what the first group captured must appear here

    var reg = /^(\w)(\w)\2\1$/;
    // matches "woow", "1221", ...

8. The string method replace: replaces characters in a string with new content

1) Without a regular expression

Each call replaces only the first occurrence in the string, so replacing several occurrences takes several calls.

    var str = "zhufeng2015 zhufeng2016";
    // "zhufeng" -> "Mount Everest"
    str = str.replace("zhufeng", "Mount Everest").replace("zhufeng", "Mount Everest");

Sometimes even repeated calls cannot achieve the replacement, because every call finds the first "zhufeng" again, including the one inside the text that was just inserted.

    var str = "zhufeng2015 zhufeng2016";
    // "zhufeng" -> "zhufengpeixun"
    str = str.replace("zhufeng", "zhufengpeixun").replace("zhufeng", "zhufengpeixun");

[The first parameter can be a regular expression] Everything in the string that matches the regular expression is replaced (but, as with capture, replace is lazy by default and replaces all matches only when the global modifier g is added)

    var str = "zhufeng2015 zhufeng2016";
    str = str.replace(/zhufeng/g, "zhufengpeixun");
    console.log(str);
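
The difference between the string form and the regular expression form becomes visible when the intermediate results are printed. A small sketch, with illustrative variable names:

```javascript
const original = "zhufeng2015 zhufeng2016";

// String pattern: each call replaces only the first "zhufeng" it finds
const once = original.replace("zhufeng", "zhufengpeixun");
// "zhufengpeixun2015 zhufeng2016"
const twice = once.replace("zhufeng", "zhufengpeixun");
// "zhufengpeixunpeixun2015 zhufeng2016": the first occurrence was replaced again

// Regular expression with g: every occurrence in the original string is replaced once
const all = original.replace(/zhufeng/g, "zhufengpeixun");
// "zhufengpeixun2015 zhufengpeixun2016"

console.log(once);
console.log(twice);
console.log(all);
```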

When the second parameter of replace is a function:

1) When the function runs and how many times

The principle is exactly the same as exec capture: each time the regular expression captures a match in the string, the function runs once. In the example below there are two captures in total, so the function runs twice.

2) The parameters

replace not only runs the function but also passes arguments to it, and the arguments are exactly the values that each exec capture would produce.
The first exec capture gives ["zhufeng", index: 0, input: "original string"], so on the first run of the function:
arguments[0] -> "zhufeng", the content captured by the whole regular expression
arguments[1] -> 0, equivalent to index in exec, the position where the capture starts
arguments[2] -> "original string", equivalent to input in exec
If the regular expression contains groups, the content of each group is passed right after arguments[0], before the index.

3) The return value

Whatever the function returns replaces the content captured this time.

    var str = "zhufeng2015 zhufeng2016";
    str = str.replace(/zhufeng/g, function () {
        console.log(arguments); // runs twice, once per capture
        return "zhufengpeixun"; // replaces the captured "zhufeng"
    });
    console.log(str); //->"zhufengpeixun2015 zhufengpeixun2016"
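
The three points above can be sketched with a regular expression that also contains a group, so that the group content shows up among the arguments. The parameter names `whole`, `digits`, `index`, and `input` and the array `calls` are chosen for this illustration.

```javascript
const source = "zhufeng2015 zhufeng2016";
const calls = [];

const result = source.replace(/zhufeng(\d+)/g, function (whole, digits, index, input) {
  // Record what replace passes in on each run
  calls.push({ whole: whole, digits: digits, index: index });
  // The return value replaces the content captured this time
  return "zhufengpeixun" + (Number(digits) + 1);
});

console.log(calls.length); // 2: one run per capture
console.log(calls[0]);     // { whole: "zhufeng2015", digits: "2015", index: 0 }
console.log(calls[1]);     // { whole: "zhufeng2016", digits: "2016", index: 12 }
console.log(result);       // "zhufengpeixun2016 zhufengpeixun2017"
```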