Detailed explanation of JavaScript regular expressions

1. Regular expression creation

There are two ways to create regular expressions in JavaScript:

The first: write the pattern directly as a literal, /regular expression/
The second: create a RegExp object with new RegExp('regular expression')

const re1 = /ABC\-001/;
const re2 = new RegExp('ABC\\-001');
re1; // /ABC\-001/
re2; // /ABC\-001/

Note that with the second form the pattern is passed as a string, so string escaping is applied first: the two characters \\ in the string literal produce a single \ in the pattern.
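The escaping rule is easiest to see by comparing the two forms side by side. The following is a minimal sketch; the variable names (literal, fromString, wrong, right) are made up for illustration.

```javascript
// The literal form and the constructor form describe the same pattern.
const literal = /ABC\-001/;
const fromString = new RegExp('ABC\\-001');
console.log(literal.source === fromString.source); // true
console.log(fromString.test('ABC-001')); // true

// Forgetting to double the backslash changes the pattern:
// the string '\d+' is just 'd+', so it matches the letter d, not digits.
const wrong = new RegExp('\d+');
const right = new RegExp('\\d+');
console.log(wrong.source); // d+
console.log(right.source); // \d+
console.log(wrong.test('123')); // false
console.log(right.test('123')); // true
```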

2. Writing patterns

2.1 Using simple patterns

A simple pattern consists of the characters you want to match literally. For example, the pattern /abc/ matches only where the characters 'abc' appear together and in exactly that order. It matches in "Hi, do you know your abc's?" and in "The latest airplane designs evolved from slabcraft."; in both cases the matched substring is 'abc'. It does not match "Grab crack", because that string contains 'ab c' but never the exact substring 'abc'.
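The three sample strings can be checked directly with test(). This is only a small sketch; the names pattern, samples, and results are chosen for the example.

```javascript
const pattern = /abc/;
const samples = [
  "Hi, do you know your abc's?",
  'The latest airplane designs evolved from slabcraft.',
  'Grab crack',
];

// test() returns true when the pattern is found anywhere in the string.
const results = samples.map((s) => pattern.test(s));
console.log(results); // [ true, true, false ]
```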

2.2 Using special characters

For example, the pattern /ab*c/ matches a single 'a' followed by zero or more 'b's (the * means zero or more of the preceding item) and then immediately by 'c'. In the string "cbbabbbbbcdebc" this pattern matches the substring "abbbbbc".
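As a quick check of how * behaves, the sketch below runs the pattern /ab*c/ against a sample string and a few shorter inputs. The inputs are illustrative choices, not part of any fixed test set.

```javascript
const re = /ab*c/;
const m = 'cbbabbbbbcdebc'.match(re);

console.log(m[0]);    // abbbbbc
console.log(m.index); // 3

// "Zero or more" means the b's are optional.
console.log(re.test('ac'));  // true
console.log(re.test('abd')); // false, no 'c' after the b's
```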

Character	Meaning
\	Follows these rules: a backslash before a non-special character indicates that the next character is special and is not to be interpreted literally. For example, a 'd' that is not preceded by '\' matches a lowercase 'd'; with the '\' added, /\d/ takes on a special meaning and matches a digit. A backslash before a special character indicates that the character is to be interpreted literally. For example, the pattern /a*/ matches zero or more 'a's, whereas the pattern /a\*/ removes the special meaning of '*', so it matches strings like "a*". When using new RegExp("pattern"), remember to escape \ itself, because \ is also the escape character in strings.
^	Match the beginning of the input, for example, /^A/ does not match 'A' in "an A", but will match 'A' in "An E".
$	Match the end of the input. For example, /t$/ does not match 't' in "eater", but will match 't' in "eat".
*	Match the previous expression 0 or more times. Equivalent to {0,}. For example, /bo*/ matches 'booooo' in "A ghost boooooed".
+	Match the previous expression once or multiple times. Equivalent to {1,}. For example, /a+/ matches 'a' in "candy" and all 'a' in "caaaaaandy".
?	Match the previous expression 0 or 1 time. Equivalent to {0,1}. For example, /e?le?/ matches 'el' in "angel", 'le' in "angle", and 'l' in "oslo". When ? immediately follows any of the quantifiers *, +, ?, or {}, it makes that quantifier non-greedy (matching as few characters as possible), the opposite of the default greedy behavior (matching as many characters as possible). For example, applying /\d+/ to "123abc" matches "123", whereas /\d+?/ matches only "1".
.	Match any single character except line breaks. For example, /.n/ will match 'an' and 'on' in "nay, an apple is on the tree", but will not match 'nay'.
x|y	Match either 'x' or 'y'. For example, /green|red/ matches 'green' in "green apple" and 'red' in "red apple".
{n}	n is a positive integer. Match exactly n occurrences of the previous expression. For example, /a{2}/ does not match the 'a' in "candy", but matches both 'a's in "caandy" and the first two 'a's in "caaandy".
{n,m}	n and m are non-negative integers with n <= m. Match at least n and at most m occurrences of the previous expression. For example, /a{1,3}/ matches nothing in "cndy", the 'a' in "candy", both 'a's in "caandy", and the first three 'a's in "caaaaaaandy". Note that for "caaaaaaandy" the match is "aaa", even though the original string contains more 'a's. Do not put a space inside the braces, or they are no longer parsed as a quantifier.
[xyz]	A character set. Match any one of the characters in the square brackets, including escape sequences. Use a hyphen (-) to specify a range of characters. Special characters such as the dot (.) and the asterisk (*) have no special meaning inside a character set, so they do not need to be escaped, although escaping them also works. For example, [abcd] and [a-d] are the same; both match the 'b' in "brisket" and the 'c' in "city". /[a-z.]+/ and /[\w.]+/ both match the whole string "test.i.ng".
[^xyz]	A negated character set: match any one character that is not listed in the square brackets. Use a hyphen (-) to specify a range of characters. Any character that works in an ordinary character set works here. For example, [^abc] and [^a-c] are the same; both first match the 'r' in "brisket".
\b	Match a word boundary: a position where a word character is not followed by, or not preceded by, another word character. The boundary itself is not part of the match; in other words, a matched boundary has length 0. For example, /\bm/ matches the 'm' in "moon"; /oo\b/ does not match the 'oo' in "moon" because 'oo' is followed by the word character 'n'; /oon\b/ matches the 'oon' in "moon" because 'oon' is at the end of the string and so is not followed by a word character.
\d	Match a digit. Equivalent to [0-9]. For example, /\d/ or /[0-9]/ matches '2' in "B2 is the suite number."
\D	Match a non-digit character. Equivalent to [^0-9]. For example, /\D/ or /[^0-9]/ matches 'B' in "B2 is the suite number."
\f	Match a form feed (U+000C).
\n	Match a line feed (U+000A).
\r	Match a carriage return (U+000D).
\s	Match a whitespace character, including spaces, tabs, form feeds, and line breaks. Equivalent to [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]. For example, /\s\w*/ matches ' bar' in "foo bar."
\S	Match a non-whitespace character. Equivalent to [^ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]. For example, /\S\w*/ matches 'foo' in "foo bar."
\t	Match a horizontal tab character (U+0009).
\w	Match a word character (letter, digit, or underscore). Equivalent to [A-Za-z0-9_]. For example, /\w/ matches 'a' in "apple," , '5' in "$5.28," and '3' in "3D."
\W	Match a non-word character. Equivalent to [^A-Za-z0-9_]. For example, /\W/ matches '%' in "50%."
\N	Where N is a positive integer (\1, \2, and so on): a back reference to the substring matched by the Nth capture group, counting left parentheses. For example, /(\w)\1/ matches 'pp' in "apple".
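Several rows of the table can be tried out in a few lines. The sketch below uses a small hypothetical helper, first(), which returns the first matched substring or null; it is not a built-in function.

```javascript
// Hypothetical helper: return the first matched substring, or null.
const first = (re, s) => {
  const m = s.match(re);
  return m ? m[0] : null;
};

console.log(first(/^A/, 'An E'));               // A
console.log(first(/^A/, 'an A'));               // null
console.log(first(/t$/, 'eat'));                // t
console.log(first(/bo*/, 'A ghost boooooed'));  // booooo
console.log(first(/\d+?/, '123abc'));           // 1
console.log(first(/a{1,3}/, 'caaaaaaandy'));    // aaa
console.log(first(/oon\b/, 'moon'));            // oon
console.log(first(/oo\b/, 'moon'));             // null
console.log(first(/(\w)\1/, 'hello'));          // ll
console.log(JSON.stringify(first(/\s\w*/, 'foo bar'))); // " bar"
```

Wrapping the last call in JSON.stringify makes the leading space in the match visible.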

3. Application

3.1 Splitting strings

Splitting a string with a regular expression is more flexible than splitting on a fixed character. The usual splitting code:

'a b   c'.split(' '); // ['a', 'b', '', '', 'c']

This approach cannot handle consecutive spaces. Use a regular expression instead:

'a b   c'.split(/\s+/); // ['a', 'b', 'c']

No matter how many spaces there are, the string is split correctly. Now add ',' to the separators:

'a,b, c  d'.split(/[\s\,]+/); // ['a', 'b', 'c', 'd']

Then add ';' as well:

'a,b;; c  d'.split(/[\s\,\;]+/); // ['a', 'b', 'c', 'd']

In this way, regular expressions can turn irregular input into a clean array.
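One detail worth knowing: when the input starts or ends with a separator, split() returns an empty string at that end. A minimal sketch of a wrapper that drops those entries follows; splitFields is a hypothetical name chosen for this example.

```javascript
// Hypothetical helper: split on runs of whitespace, commas, or semicolons.
function splitFields(input) {
  return input
    .split(/[\s,;]+/)
    // A separator at the very start or end produces an empty string; drop it.
    .filter((field) => field !== '');
}

console.log(splitFields('a,b;; c  d')); // [ 'a', 'b', 'c', 'd' ]
console.log('  a, b ;'.split(/[\s,;]+/)); // [ '', 'a', 'b', '' ]
console.log(splitFields('  a, b ;'));     // [ 'a', 'b' ]
```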

3.2 Grouping

Besides testing whether a string matches, a regular expression can also extract substrings. Each group to extract is wrapped in (). For example:

^(\d{4})-(\d{4,9})$ defines two groups, so the area code and the local number can be extracted directly from the matched string:

var re = /^(\d{4})-(\d{4,9})$/;
re.exec('0530-12306'); // ['0530-12306', '0530', '12306']
re.exec('0530 12306'); // null

When exec() matches successfully, it returns an array. The first element is the entire string matched by the regular expression, and the following elements are the substrings captured by each group.

The exec() method returns null when the match fails.
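Because exec() returns either an array or null, calling code usually checks for null before reading the groups. The sketch below shows one way to do that; parsePhone and its field names are hypothetical, chosen only for illustration.

```javascript
const phone = /^(\d{4})-(\d{4,9})$/;

// Hypothetical helper: return the captured parts, or null when there is no match.
function parsePhone(text) {
  const match = phone.exec(text);
  if (match === null) return null;
  // Index 0 is the whole match; indexes 1 and 2 are the two groups.
  const [whole, areaCode, localNumber] = match;
  return { whole, areaCode, localNumber };
}

console.log(parsePhone('0530-12306'));
// { whole: '0530-12306', areaCode: '0530', localNumber: '12306' }
console.log(parsePhone('0530 12306')); // null
```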

3.3 Greedy Match

Note that regular expression matching is greedy by default, that is, it matches as many characters as possible. For example, to capture the trailing 0s of a number:

var re = /^(\d+)(0*)$/;
re.exec('102300'); // ['102300', '102300', '']

Because \d+ is greedy, it consumes the trailing 0s as well, leaving 0* to match only the empty string.

To let 0* capture the trailing 0s, \d+ must be non-greedy (match as few characters as possible). Adding a ? after it does that:

var re = /^(\d+?)(0*)$/;
re.exec('102300'); // ['102300', '1023', '00']
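Putting the two patterns side by side makes the difference easier to compare. This is a small sketch; splitTrailingZeros and the property names head and zeros are made up for the example.

```javascript
const greedy = /^(\d+)(0*)$/;
const lazy = /^(\d+?)(0*)$/;

// Hypothetical helper: report what each of the two groups captured.
function splitTrailingZeros(re, digits) {
  const m = re.exec(digits);
  return m ? { head: m[1], zeros: m[2] } : null;
}

console.log(splitTrailingZeros(greedy, '102300')); // { head: '102300', zeros: '' }
console.log(splitTrailingZeros(lazy, '102300'));   // { head: '1023', zeros: '00' }
console.log(splitTrailingZeros(lazy, '1000'));     // { head: '1', zeros: '000' }
```

The lazy group still has to take at least one digit, and it grows only until the rest of the pattern (0* followed by the end of the string) can succeed.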

3.4 Regular Expression Flags

g	Global search.
i	Case-insensitive search.
m	Multi-line search.
y	Perform a "sticky" search: matching starts at the current position in the target string. Enabled with the y flag.
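The effect of each flag can be seen on one small multi-line string. The following is a sketch with an arbitrary sample text; for the sticky flag, the "current position" is the lastIndex property of the regular expression.

```javascript
const text = 'Cat\ncat\nCAT';

console.log(text.match(/cat/g));    // [ 'cat' ]
console.log(text.match(/cat/gi));   // [ 'Cat', 'cat', 'CAT' ]
// Without m, ^ matches only at the start of the whole input.
console.log(text.match(/^cat/gi));  // [ 'Cat' ]
// With m, ^ also matches at the start of every line.
console.log(text.match(/^cat/gim)); // [ 'Cat', 'cat', 'CAT' ]

// With y, the match must begin exactly at lastIndex.
const sticky = /cat/y;
sticky.lastIndex = 4;
console.log(sticky.test(text));  // true, 'cat' starts at index 4
console.log(sticky.lastIndex);   // 7
sticky.lastIndex = 5;
console.log(sticky.test(text));  // false, nothing starts at index 5
console.log(sticky.lastIndex);   // 0, reset after the failed match
```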

3.5 test() method

The test() method checks whether a string matches a pattern. It returns true if the string contains text that matches, and false otherwise.

var re = /^(\d{4})-(\d{4,9})$/;
re.test('0530-12321'); // true
re.test('0530-123ab'); // false
re.test('0530 12321'); // false
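Two behaviors of test() are worth a quick experiment: without ^ and $ it succeeds when any part of the string matches, and with the g flag it remembers where the last match ended. The sketch below illustrates both; the variable names are chosen for the example.

```javascript
const phone = /^(\d{4})-(\d{4,9})$/;
const inputs = ['0530-12321', '0530-123ab', '0530 12321'];
const checks = inputs.map((t) => phone.test(t));
console.log(checks); // [ true, false, false ]

// Without anchors, a match anywhere in the string is enough.
console.log(/\d{4}-\d{4,9}/.test('tel: 0530-12321 (home)')); // true

// With the g flag, test() resumes from lastIndex on the next call.
const globalDigits = /\d+/g;
const firstCall = globalDigits.test('42');
const secondCall = globalDigits.test('42');
console.log(firstCall);  // true, lastIndex is now 2
console.log(secondCall); // false, the search resumed at index 2
```

For simple validation, leaving off the g flag avoids this state entirely.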

4. Commonly used rules (reference)

Validate email address: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
Validate ID card number (15 or 18 digits): ^(\d{15}|\d{18})$
Mainland China mobile number: 1\d{10}
Mainland China landline number: (\d{4}-|\d{3}-)?(\d{8}|\d{7})
Mainland China postal code: [1-9]\d{5}
IP address: ((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)
Date (year-month-day): (\d{4}|\d{2})-((1[0-2])|(0?[1-9]))-(([12][0-9])|(3[01])|(0?[1-9]))
Date (month/day/year): ((1[0-2])|(0?[1-9]))/(([12][0-9])|(3[01])|(0?[1-9]))/(\d{4}|\d{2})
Validate digits: ^[0-9]*$
Validate an n-digit number: ^\d{n}$
Validate at least n digits: ^\d{n,}$
Validate m to n digits: ^\d{m,n}$
Validate a number that is zero or starts with a non-zero digit: ^(0|[1-9][0-9]*)$
Validate a positive real number with 1 to 3 decimal places: ^[0-9]+(\.[0-9]{1,3})?$
Validate a non-zero positive integer: ^\+?[1-9][0-9]*$
Validate a non-zero negative integer: ^\-[1-9][0-9]*$
Validate a non-negative integer (positive integer + 0): ^\d+$
Validate a non-positive integer (negative integer + 0): ^((-\d+)|(0+))$
Validate a string of length 3: ^.{3}$
Validate a string consisting of English letters: ^[A-Za-z]+$
Validate a string consisting of uppercase English letters: ^[A-Z]+$
Validate a string consisting of lowercase English letters: ^[a-z]+$
Validate a string consisting of digits and English letters: ^[A-Za-z0-9]+$
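A few of these rules can be wrapped in a lookup table for whole-string validation. The sketch below is one possible arrangement under stated assumptions, not a complete validator: the rules object, the validate function, and the key names are hypothetical, and ^ and $ have been added to the rules that the list gives without anchors so that the entire input must match.

```javascript
// Hypothetical rule table; anchors (^ and $) are added for whole-string checks.
const rules = {
  email: /^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$/,
  mobile: /^1\d{10}$/,
  postalCode: /^[1-9]\d{5}$/,
  ipv4: /^((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)$/,
  decimal1to3: /^[0-9]+(\.[0-9]{1,3})?$/,
};

// Hypothetical helper: check a value against a named rule.
function validate(ruleName, value) {
  return rules[ruleName].test(value);
}

console.log(validate('email', 'user.name@example.com')); // true
console.log(validate('email', 'user@@example.com'));     // false
console.log(validate('mobile', '13800000000'));          // true
console.log(validate('postalCode', '012345'));           // false
console.log(validate('ipv4', '192.168.0.1'));            // true
console.log(validate('ipv4', '256.1.1.1'));              // false
console.log(validate('decimal1to3', '3.14'));            // true
console.log(validate('decimal1to3', '3x14'));            // false
```

Patterns like these check only the shape of the input; they do not confirm that an address, number, or date actually exists.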

Summary

That's all for this article. I hope it helps you.