SoFunction
Updated on 2025-04-11

Comprehensive, organized notes on JavaScript regular expressions

var reCat = new RegExp("cat", "gi"); // The RegExp constructor takes one or two parameters: the first is the pattern string to match, the second is an optional string of flags
var reCat = /cat/gi; // The same pattern in Perl-style literal syntax
 i: Perform case-insensitive matching
 g: Perform global matching (find all matches instead of stopping after the first one)
 m: Perform multi-line matching
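As a rough illustration of the three flags, the sketch below runs the same pattern with different flag combinations. The sample string and variable names are invented for this example.

```javascript
// Sample text invented for this illustration: three lines, three spellings
const sText = "Cat\ncat\nCAT";

// No flags: case-sensitive, and only the first match is reported
const aPlain = sText.match(/cat/);            // ["cat"], found at index 4

// i: case-insensitive, still only the first match
const aInsensitive = sText.match(/cat/i);     // ["Cat"]

// g + i: every match, regardless of case
const aAll = sText.match(/cat/gi);            // ["Cat", "cat", "CAT"]

// m: ^ and $ also match at the start and end of each line
const aPerLine = sText.match(/^cat$/gim);     // ["Cat", "cat", "CAT"]

// Without m, ^ and $ only match at the very start and end of the string
const aWholeString = sText.match(/^cat$/gi);  // null
```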


Metacharacters

Metacharacters are part of the expression syntax. The metacharacters used in regular expressions are: ( [ { \ ^ $ | ) ] } ? * + . (the hyphen - is special only inside square brackets)
To match a literal question mark, escape it: var reQMark = /\?/; or var reQMark = new RegExp("\\?"); // Note the two backslashes: the string literal needs its own level of escaping
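The double escaping can be seen by comparing the two syntaxes side by side. The sketch below also includes escapeRegExp, a hypothetical helper that is not part of the original notes, for building a pattern from arbitrary text.

```javascript
// Literal syntax: one backslash escapes the question mark
const reLiteral = /\?/;

// Constructor syntax: the string literal consumes one level of escaping,
// so two backslashes are needed to hand "\?" to the regex engine
const reConstructed = new RegExp("\\?");

// Hypothetical helper (an assumption for this example): put a backslash
// in front of every metacharacter so the text is matched literally
function escapeRegExp(sText) {
  return sText.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const sEscaped = escapeRegExp("1+1=2?");   // the string 1\+1=2\?
const reExact = new RegExp(sEscaped);
const bExact = reExact.test("1+1=2?");     // true
const bLoose = reExact.test("11=2");       // false, "+" is no longer a quantifier
```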

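\xxx Matches the character specified by the octal number xxx, e.g. /\142/ matches the character b
\xdd Matches the character specified by the hexadecimal number dd, e.g. /\x62/ matches the character b
\uxxxx Matches the Unicode character specified by the hexadecimal number xxxx, e.g. /\u0062/ matches the character b
\r Matches a carriage return
\n Matches a newline
\f Matches a form feed
\t Matches a tab
\v Matches a vertical tab
\a Matches the alert character in Perl-style engines; JavaScript does not support it, and /\a/ simply matches the letter a
\e Matches the escape character in Perl-style engines; JavaScript does not support it, and /\e/ simply matches the letter e
\cX Matches the control character corresponding to X
\0 Matches the NULL character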
 
. Matches any single character except line terminators; roughly equivalent to [^\n\r]
\w Matches a word character; equivalent to [a-zA-Z_0-9]
\W Matches a non-word character; equivalent to [^a-zA-Z_0-9]
\d Matches a digit; equivalent to [0-9]
\D Matches a non-digit character; equivalent to [^0-9]
\s Matches a whitespace character; roughly equivalent to [ \t\n\x0B\f\r], where \x0B is the vertical tab (JavaScript also treats other Unicode spaces as whitespace)
\S Matches a non-whitespace character; roughly equivalent to [^ \t\n\x0B\f\r]
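A short sketch of the predefined classes in use. The sample string is invented for illustration.

```javascript
// Sample text invented for this illustration (note the tab character)
const sSample = "Order #42 ships\ton 2025-04-11";

// \d+ : runs of digits
const aDigits = sSample.match(/\d+/g);     // ["42", "2025", "04", "11"]

// \w+ : runs of word characters (letters, digits, underscore)
const aWords = sSample.match(/\w+/g);      // ["Order", "42", "ships", "on", "2025", "04", "11"]

// \s+ : split on any run of whitespace, including the tab
const aFields = sSample.split(/\s+/);      // ["Order", "#42", "ships", "on", "2025-04-11"]

// \D : remove everything that is not a digit
const sOnlyDigits = sSample.replace(/\D/g, "");  // "4220250411"
```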

Square brackets

[abc] Matches any one of the characters between the square brackets
[^abc] Matches any character that is not between the square brackets
[0-9] Matches any digit from 0 to 9
[a-z] Matches any character from lowercase a to lowercase z
[A-Z] Matches any character from uppercase A to uppercase Z
[A-z] Matches any character from uppercase A to lowercase z (this range also includes the characters [ \ ] ^ _ ` that sit between Z and a)
[adgk] Matches any character in the given set
[^adgk] Matches any character outside the given set
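The [A-z] range is a common source of surprises, because the character codes between Z (90) and a (97) belong to punctuation. A minimal sketch, with invented test strings:

```javascript
// [A-z] spans character codes 65 to 122, which includes [ \ ] ^ _ `
const reLoose = /^[A-z]+$/;
// [A-Za-z] lists the two letter ranges separately
const reStrict = /^[A-Za-z]+$/;

const bLooseUnderscore = reLoose.test("snake_case");    // true, "_" is code 95
const bStrictUnderscore = reStrict.test("snake_case");  // false
const bStrictLetters = reStrict.test("camelCase");      // true

// Negated set: is there any character outside a, d, g, k?
const bOutsideDag = /[^adgk]/.test("dag");  // false, every character is in the set
const bOutsideDog = /[^adgk]/.test("dog");  // true, "o" is outside the set
```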

Quantifiers
? Matches zero or one occurrence, e.g. ba?d matches bd, bad
+ Matches one or more occurrences, e.g. ba+d matches bad, baad
* Matches zero or more occurrences, e.g. ba*d matches bd, bad, baad
{n} Matches exactly n occurrences, e.g. ba{1}d matches bad
{n,m} Matches at least n but no more than m occurrences, e.g. ba{0,1}d matches bd, bad
{n,} Matches at least n occurrences, e.g. ba{0,}d matches bd, bad, baad, baaad, baaaad
Greedy quantifiers: first check whether the entire string matches. If it does not, drop the last character of the string and try again, repeating until a match is found or no characters remain. ?, +, *, {n}, {n,m} and {n,} are greedy by default.
Lazy quantifiers: first check whether the first character of the string matches. If that character alone is not enough, read the next character to form a two-character string, and so on, which is exactly the opposite of how greedy quantifiers work, e.g. ??, +?, *?, {n}?, {n,m}?, {n,}?
Possessive quantifiers: try to match the entire string only; if the entire string does not produce a match, no further attempt is made, e.g. ?+, ++, *+, {n}+, {n,m}+, {n,}+. JavaScript does not support possessive quantifiers; a pattern that uses one is a syntax error.

var sToMatch = "abbbaabbbaaabbb1234";
var re1 = /.*bbb/g; // Greedy: the single match is "abbbaabbbaaabbb"
var re2 = /.*?bbb/g; // Lazy: the matches are "abbb", "aabbb", "aaabbb"
// var re3 = /.*+bbb/g; // Possessive: not supported in JavaScript, this literal throws a SyntaxError
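The difference between greedy and lazy matching shows up clearly on markup-like text. The sketch below uses an invented sample string, and also checks that JavaScript rejects a possessive quantifier at construction time.

```javascript
// Sample text invented for this illustration
const sHtml = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ grabs as much as possible, so one match spans the whole string
const aGreedy = sHtml.match(/<.+>/g);  // ["<b>bold</b> and <i>italic</i>"]

// Lazy: .+? stops at the first ">" it can, so each tag is matched separately
const aLazy = sHtml.match(/<.+?>/g);   // ["<b>", "</b>", "<i>", "</i>"]

// Possessive quantifiers are not part of JavaScript's syntax:
// building the pattern from a string lets us catch the SyntaxError
let bPossessiveRejected = false;
try {
  new RegExp(".*+bbb");
} catch (oError) {
  bPossessiveRejected = oError instanceof SyntaxError;
}
```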

Grouping in complex patterns: enclose a sequence of characters, character classes and quantifiers in parentheses to treat it as a single unit.
/(dog){2}/  Matches "dogdog"
/([bd]ad?)*/  Matches the empty string and any repetition of "ba", "da", "bad", "dad"
/(mom( and dad)?)/  Matches "mom", "mom and dad"
/^\s*(.*?)\s*$/  Matches leading and trailing whitespace and captures the text in between; /^\s+|\s+$/g can also be used to strip it
Backreferences in complex patterns: groups are also known as capturing groups, and are created and numbered in the order in which their opening parentheses are encountered from left to right. For example, the expression (A?(B?(C?))) produces three backreferences numbered 1 to 3: (A?(B?(C?))), (B?(C?)), (C?)
There are several different ways to use backreferences:
First, after calling the RegExp object's test() or exec() method, or the String object's match() or search() method, the backreference values can be read from the RegExp constructor, e.g.:

var sToMatch = "#123456789"; 
var reNumbers = /#(\d+)/; 
reNumbers.test(sToMatch); 
alert(RegExp.$1); // "123456789": $1 holds the first backreference, and $2, $3 and so on hold the following ones
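RegExp.$1 is a legacy property holding global state that every successful match overwrites. As an alternative sketch, the same captured value can be read from the array that exec() returns; the variable names below are chosen for this example.

```javascript
const sToMatch = "#123456789";
const reNumbers = /#(\d+)/;

// exec() returns null when nothing matches, otherwise an array where
// index 0 is the whole match and index 1, 2, ... are the captured groups
const aMatch = reNumbers.exec(sToMatch);

const sWhole = aMatch !== null ? aMatch[0] : "";   // "#123456789"
const sDigits = aMatch !== null ? aMatch[1] : "";  // "123456789"

// A string without a match yields null, so guard before indexing
const aNoMatch = reNumbers.exec("no digits here");  // null
```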

Second, backreferences can be included directly in the expression that defines the groups, using the special escape sequences \1, \2 and so on.

var sToMatch = "dogdog"; 
var reDogdog = /(dog)\1/; // Equivalent to /dogdog/
alert(reDogdog.test(sToMatch)); // true 
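One practical use of \1 is spotting accidentally repeated words. A minimal sketch, with an invented sample sentence:

```javascript
// \b(\w+) captures a word, \s+ skips whitespace, \1 requires the same word again
const reDoubled = /\b(\w+)\s+\1\b/gi;

// Sample text invented for this illustration
const sDraft = "This is is a test of the the parser.";

const aDoubled = sDraft.match(reDoubled);       // ["is is", "the the"]

// In the replacement string, $1 refers to the captured word
const sFixed = sDraft.replace(reDoubled, "$1"); // "This is a test of the parser."
```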

Third, backreferences can be used in the replace() method of the String object, and are implemented by using special character sequences such as $1, $2, etc.

var sToChange = "1234 5678"; 
var reMatch = /(\d{4}) (\d{4})/; 
alert(sToChange.replace(reMatch, "$2 $1")); // "5678 1234" 

Alternation in complex patterns: place the pipe character (|) between two separate patterns to match either one

var reBadWords = /badword|anotherbadword/gi; 
var sUserInput = "This is a String using badword1 and badword2."; 
var sFinalText = sUserInput.replace(reBadWords, function(sMatch){ 
 return sMatch.replace(/./g, "*"); // Replace each character of the sensitive word with an asterisk
}); 
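Whitespace inside a pattern is matched literally, so spaces written around the pipe become part of each alternative. The sketch below, with invented input strings, contrasts the two spellings and then masks the matches.

```javascript
// Sample input invented for this illustration
const sInput = "a badword1 here";

// With spaces, the alternatives are "badword " and " anotherbadword"
const bSpaced = /badword | anotherbadword/i.test(sInput);  // false, "badword" is followed by "1"

// Without spaces, the alternatives are the bare words
const bTight = /badword|anotherbadword/i.test(sInput);     // true

// Mask every match with as many asterisks as it has characters
const sMasked = "badword1 and anotherbadword2".replace(
  /badword|anotherbadword/gi,
  function (sMatch) {
    return "*".repeat(sMatch.length);
  }
);
```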

Non-capturing groups in complex patterns: unlike capturing groups, these create no backreferences. In longer regular expressions, storing backreferences slows matching down. A non-capturing group still lets you treat a sequence as a unit, without the overhead of storing the result.

var sToMatch = "#123456789"; 
var reNumbers = /#(?:\d+)/; // Adding a question mark and a colon right after the opening parenthesis creates a non-capturing group
reNumbers.test(sToMatch); 
alert(RegExp.$1); // "", an empty string, because the group is non-capturing
alert(sToMatch.replace(reNumbers, "abcd$1")); // "abcd$1" instead of "abcd123456789": there is no backreference to use

Another example:

String.prototype.stripHTML = function(){ // The method name stripHTML is assumed here
 var reTag = /<(?:.|\s)*?>/g; // Match all HTML tags, to prevent malicious HTML from being inserted
 return this.replace(reTag, ""); 
} 
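The effect of ?: is easiest to see in the array returned by exec(). A minimal sketch; the third pattern and its input are invented for illustration.

```javascript
const sToMatch = "#123456789";

// Capturing group: the digits are stored at index 1
const aCapturing = /#(\d+)/.exec(sToMatch);       // ["#123456789", "123456789"]

// Non-capturing group: same match, but nothing is stored beyond index 0
const aNonCapturing = /#(?:\d+)/.exec(sToMatch);  // ["#123456789"]

// Mixing both: group the titles without capturing them, capture only the name
const aName = /(?:Mr|Ms)\. (\w+)/.exec("Ms. Lovelace");  // ["Ms. Lovelace", "Lovelace"]
```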

Lookahead in complex patterns: tells the regular expression engine to look ahead at the following characters without moving its position. There is a positive lookahead (checks that what follows is a certain set of characters) and a negative lookahead (checks that what follows is not).
Positive lookahead (?=n) matches any string that is immediately followed by the specified string n, without including n in the match. Note that the parentheses here do not form a group.
Negative lookahead (?!n) matches any string that is not immediately followed by the specified string n, e.g.:

var sToMatch1 = "bedroom"; 
var sToMatch2 = "bedding"; 
var reBed1 = /(bed(?=room))/; 
var reBed2 = /(bed(?!room))/; 
alert(reBed1.test(sToMatch1)); // true 
alert(RegExp.$1); // Outputs "bed" rather than "bedroom"
alert(reBed1.test(sToMatch2)); // false 
alert(reBed2.test(sToMatch1)); // false 
alert(reBed2.test(sToMatch2)); // true 
alert(RegExp.$1); // The output is also "bed"
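Because a lookahead checks what follows without consuming it, it is handy for inserting text at computed positions. The sketch below uses a hypothetical helper, addThousandsSeparators, which is not part of the original notes.

```javascript
// Hypothetical helper: put a comma after every digit that is followed
// by one or more complete groups of three digits up to the end
function addThousandsSeparators(sDigits) {
  // (?=(\d{3})+$) only inspects the following digits, it does not consume them
  return sDigits.replace(/(\d)(?=(\d{3})+$)/g, "$1,");
}

const sMillions = addThousandsSeparators("1234567");  // "1,234,567"
const sSmall = addThousandsSeparators("123");         // "123"

// Negative lookahead: a "q" that is NOT followed by "u"
const bIraq = /q(?!u)/.test("Iraq");  // true
const bQuit = /q(?!u)/.test("quit");  // false
```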

Boundaries in complex patterns: used to specify the position of a pattern within the string
n$ matches any string that ends with n, e.g. /(\w+)\.$/ matches a word followed by a period at the end of the line, such as "one." or "two."
^n matches any string that begins with n, e.g. /^(.+?)\b/ matches the characters from the start of the string up to the first word boundary.
\b Matches at the beginning or end of a word, e.g. /\b(\S+?)\b/g or /(\w+)/g can be used to extract the words from a string
\B Matches at a position that is not the beginning or end of a word
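A small sketch of how \b and \B change which occurrences of the same letters are matched. The sample string is invented for illustration.

```javascript
// Sample text invented for this illustration
const sText = "cat concat catalog scatter";

// \bcat\b : "cat" as a whole word only
const aWhole = sText.match(/\bcat\b/g);    // ["cat"]

// \bcat : "cat" at the start of a word ("cat", "catalog")
const aStarts = sText.match(/\bcat/g);     // 2 matches

// cat\b : "cat" at the end of a word ("cat", "concat")
const aEnds = sText.match(/cat\b/g);       // 2 matches

// \Bcat\B : "cat" strictly inside a word ("scatter")
const aInside = sText.match(/\Bcat\B/g);   // 1 match
```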
Multi-line mode in complex patterns:

var sToMatch = "First second\nthird fourth\nfifth sixth"; 
var reLastWordOnLine = /(\w+)$/gm; 
alert(sToMatch.match(reLastWordOnLine)); // Outputs ["second", "fourth", "sixth"] instead of just "sixth"

Properties and methods of RegExp objects:
global  // Whether the RegExp object has the g flag
ignoreCase  // Whether the RegExp object has the i flag
multiline  // Whether the RegExp object has the m flag
source  // The source text of the regular expression
lastIndex // An integer giving the character position at which the next match will start (it is only updated by exec() and test() on a regular expression with the g flag; otherwise it stays 0)
The property used most in practice is lastIndex, e.g.:

 
var sToMatch = "bbq is short for barbecue"; 
var reB = /b/g; 
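reB.exec(sToMatch); 
alert(reB.lastIndex); // 1: the match is at position 0, so lastIndex is 1
reB.exec(sToMatch); 
alert(reB.lastIndex); // 2 
reB.exec(sToMatch); 
alert(reB.lastIndex); // 18 
reB.lastIndex = 0; // Start matching from the beginning again
reB.exec(sToMatch); 
alert(reB.lastIndex); // 1 instead of 21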
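Since exec() on a global expression resumes from lastIndex, it can be called in a loop to visit every match. The sketch below collects the match positions, and also shows a side effect of the same mechanism when test() is called repeatedly on a global expression.

```javascript
const sToMatch = "bbq is short for barbecue";
const reB = /b/g;

// Each call resumes at reB.lastIndex; exec() returns null when no match is left
const aPositions = [];
let aMatch;
while ((aMatch = reB.exec(sToMatch)) !== null) {
  aPositions.push(aMatch.index);
}
// aPositions is now [0, 1, 17, 20], and the failed final call reset lastIndex to 0

// The same state affects test(): the second call starts at lastIndex 3,
// finds nothing in a 3-character string, and returns false
const reCat = /cat/g;
const bFirst = reCat.test("cat");   // true
const bSecond = reCat.test("cat");  // false
```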

Static properties
input, short name $_, the string last matched against (the string passed to exec() or test())
leftContext, short name $`, the substring before the last match
rightContext, short name $', the substring after the last match
lastMatch, short name $&, the last matched text
lastParen, short name $+, the last matched group
multiline, short name $*, a Boolean specifying whether all expressions run in multi-line mode. Unlike the other properties, it does not depend on the last executed match; it sets the m option for all regular expressions, e.g. RegExp.multiline = true;. Note that IE and Opera do not support it

 var sToMatch = "this has been a short, short summer"; 
 var reShort = /(s)hort/g; 
 reShort.test(sToMatch); 
 alert(RegExp.input); // "this has been a short, short summer" 
 alert(RegExp.leftContext); // "this has been a " 
 alert(RegExp.rightContext); // ", short summer" 
 alert(RegExp.lastMatch); // "short" 
 alert(RegExp.lastParen); // "s" 
 
compile() // Compiles the regular expression
var reCat = /cat/i; // Without the g flag, every call starts matching at position 0
alert(reCat.exec("a cat, a Cat, a cAt caT")); // exec() returns an array: the first entry is the first match, the others are backreferences
alert(reCat.test("cat")); // true: test() checks whether the string contains a match and returns true or false

String object methods that support regular expressions

var sToMatch = "a bat, a Cat, a fAt, a faT cat"; 
var reAt = /at/gi; 
alert(sToMatch.match(reAt)); // Returns an array of all the matches contained in the string
alert(sToMatch.search(reAt)); // Outputs 3, the position of the first match in the string; the g flag has no effect in search()
alert(sToMatch.replace(reAt, "Dog")); // Replaces the substrings that match the regular expression
alert(sToMatch.replace(reAt, function(sMatch){ 
 return "Dog"; 
})); 
alert(sToMatch.split(/\,/)); // Splits the string into an array of strings
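A few more sketches of the same String methods, with invented inputs. Note how match() returns captured groups without the g flag, but a flat list of all matches with it.

```javascript
// replace() with a function: the captured groups arrive as extra arguments
const sCamel = "background-color".replace(/-([a-z])/g, function (sMatch, sLetter) {
  return sLetter.toUpperCase();
});  // "backgroundColor"

// match() without g: first match plus its captured groups
const aFirst = "2025-04-11".match(/(\d+)-(\d+)/);  // ["2025-04", "2025", "04"]

// match() with g: every match, no groups
const aEvery = "2025-04-11".match(/\d+/g);         // ["2025", "04", "11"]

// split() on a character class: commas, semicolons and whitespace all separate
const aParts = "a, b;c  d".split(/[,;\s]+/);       // ["a", "b", "c", "d"]

// search() returns the index of the first match, or -1
const iFound = "a bat, a Cat".search(/cat/i);      // 9
const iMissing = "a bat, a Cat".search(/dog/i);    // -1
```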

Common patterns
Date (dd/mm/yyyy): /(?:0[1-9]|[12][0-9]|3[01])\/(?:0[1-9]|1[0-2])\/(?:19|20)\d{2}/
URL: /^http:\/\/([\w-]+\.)+[\w-]+(\/[\w\-.\/?%&=]*)?$/
E-mail address: /^(?:\w+\.?)*\w+@(?:\w+\.?)*\w+$/
Domestic phone number: \d{3}-\d{8}|\d{4}-\d{7}
Tencent QQ number: [1-9][0-9]{4,}
Postal code: [1-9]\d{5}(?!\d)
ID card: \d{15}|\d{18}
IP address: \d+\.\d+\.\d+\.\d+
Chinese characters: [\u4e00-\u9fa5]
Double-byte characters (including Chinese characters): [^\x00-\xff]
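    String.prototype.len = function(){return this.replace(/[^\x00-\xff]/g,"aa").length;} // Counts each double-byte character as length 2; the method name len is assumed here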
Full-width characters: /[\uFF00-\uFFFF]/g
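The IP address pattern above only checks the shape of the text (four dot-separated runs of digits), so it also accepts values such as 999.999.999.999. One possible sketch of a stricter check follows; isIPv4 is a hypothetical helper, and it deliberately keeps things simple by still allowing leading zeros.

```javascript
// Hypothetical helper: check the shape with a regular expression,
// then check the numeric range of each captured part in plain code
function isIPv4(sAddress) {
  const aMatch = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(sAddress);
  if (aMatch === null) {
    return false;
  }
  // aMatch[1] to aMatch[4] hold the four captured parts
  return aMatch.slice(1).every(function (sPart) {
    return Number(sPart) <= 255;
  });
}

const bValid = isIPv4("192.168.0.1");   // true
const bTooBig = isIPv4("999.1.1.1");    // false, 999 is out of range
const bTooShort = isIPv4("1.2.3");      // false, only three parts
```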
Match specific numbers:

^[1-9]\d*$    // Match a positive integer
^-[1-9]\d*$   // Match a negative integer
^-?[1-9]\d*$  // Match a non-zero integer
^([1-9]\d*|0)$  // Match a non-negative integer (positive integer + 0)
^(-[1-9]\d*|0)$ // Match a non-positive integer (negative integer + 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // Match a positive floating point number
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$  // Match a negative floating point number
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$  // Match a floating point number
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // Match a non-negative floating point number (positive floating point number + 0)
^((-([1-9]\d*\.\d*|0\.\d*[1-9]\d*))|0?\.0+|0)$ // Match a non-positive floating point number (negative floating point number + 0)
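Alternation has the lowest precedence in a pattern, so in a form such as ^[1-9]\d*|0$ each anchor binds to only one branch; wrapping the alternatives in a group applies both anchors to every branch. The sketch below illustrates this and applies a few of the number patterns; oPatterns and classify are names invented for this example.

```javascript
// Without a group: "starts with a positive integer" OR "ends with 0"
const bUngrouped = /^[1-9]\d*|0$/.test("abc0");      // true, via the "0$" branch
// With a group: the whole string must be a positive integer or 0
const bGrouped = /^(?:[1-9]\d*|0)$/.test("abc0");    // false

// Hypothetical lookup table built from some of the patterns above
const oPatterns = {
  positiveInteger: /^[1-9]\d*$/,
  negativeInteger: /^-[1-9]\d*$/,
  nonNegativeInteger: /^(?:[1-9]\d*|0)$/,
  float: /^-?(?:[1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$/
};

// Hypothetical helper: list the names of all patterns that accept the value
function classify(sValue) {
  return Object.keys(oPatterns).filter(function (sName) {
    return oPatterns[sName].test(sValue);
  });
}

const aFortyTwo = classify("42");   // ["positiveInteger", "nonNegativeInteger"]
const aZero = classify("0");        // ["nonNegativeInteger", "float"]
const aPi = classify("-3.14");      // ["float"]
const aPadded = classify("007");    // [], leading zeros are rejected everywhere
```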

Comprehensive and detailed enough? If you found it useful, bookmark this article. Regular expressions are an important part of learning JavaScript, so study them carefully.