SoFunction
Updated on 2025-04-03

A detailed guide to using regular expressions in JavaScript

[1] Definition: A regular expression, also called a rule or pattern, is a powerful string-matching tool. In JavaScript it is an object

[2] Features

[2.1] Greedy: a quantifier matches the longest text it can
[2.2] Lazy: if the g flag is not set, only the first match is found
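
The two features above can be seen in a small sketch. The sample strings and variable names are made up for illustration. The `?` suffix that turns a quantifier non-greedy is not mentioned in the list above; it is included only for contrast.

```javascript
// Greedy: ".+" consumes as much as possible, so the match runs to the LAST quote.
var quoted = 'say "one" and "two"';
var greedyMatch = quoted.match(/".+"/)[0];

// A trailing ? makes the quantifier non-greedy (shortest possible match).
var shortMatch = quoted.match(/".+?"/)[0];

// Without g, match() stops at the first hit; with g it collects every hit.
var firstOnly = 'cat,bat,sat'.match(/.at/);
var allHits = 'cat,bat,sat'.match(/.at/g);

console.log(greedyMatch);                  // "one" and "two"
console.log(shortMatch);                   // "one"
console.log(firstOnly[0], allHits.length); // cat 3
```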

[3] Two ways to write a regular expression:

[3.1] Perl style (literal form): var expression = /pattern/flags;
e.g. var pattern = /a/i; // Matches the first 'a' or 'A' in the string (add the g flag to match every instance)
[3.1.1] Three flags
[a]g: represents global mode (global)
[b]i: means case insensitive (ignoreCase)
[c]m: represents multiline mode
[3.2] JavaScript style (RegExp constructor): takes two parameters, the pattern string to match and an optional flag string
e.g. var pattern = new RegExp('[bc]at','i');
[Note] Both parameters of the RegExp constructor are strings
[3.3] The difference between the constructor and the literal
[Note] Any expression that can be defined with a literal can also be defined with the constructor
[3.3.1] A literal cannot contain a variable, so a pattern that depends on a variable must be built with the constructor
[tips] Get elements by class name (because classname is a variable, only the constructor form can be used)

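function getByClass(obj,classname){
  var elements = obj.getElementsByTagName('*');
  var result = [];
  //The groups keep | local: (start or whitespace) + classname + (whitespace or end)
  var pattern = new RegExp('(^|\\s)' + classname + '(\\s|$)');
  for(var i = 0; i < elements.length; i++){
    if(pattern.test(elements[i].className)){
      result.push(elements[i]);
    }
  }
  return result;
}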
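
One pitfall when concatenating a pattern from a variable is the low precedence of `|`. The sketch below works on plain strings so it runs without a DOM. `hasClass` and `hasClassUngrouped` are hypothetical helpers, and the class name is assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: does the space-separated list `className` contain `name`?
function hasClass(className, name) {
  // Grouped: (start or whitespace) + name + (whitespace or end).
  var pattern = new RegExp('(^|\\s)' + name + '(\\s|$)');
  return pattern.test(className);
}

// Ungrouped version for comparison. It is parsed as:  ^  OR  \sNAME\s  OR  $
function hasClassUngrouped(className, name) {
  var pattern = new RegExp('^|\\s' + name + '\\s|$');
  return pattern.test(className);
}

console.log(hasClass('box big', 'big'));             // true
console.log(hasClass('bigger box', 'big'));          // false
console.log(hasClassUngrouped('bigger box', 'big')); // true (the bare ^ always matches)
```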

[3.3.2] In ECMAScript 3, a regular expression literal always shares the same RegExp instance, while every call to the constructor creates a new instance

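var re = null, i;
//ES3: the literal is one shared instance, so lastIndex carries over between iterations
for(i = 0; i < 10; i++){
  re = /cat/g;
  re.test('catastrophe');
}
//The constructor creates a new instance on every iteration
for(i = 0; i < 10; i++){
  re = new RegExp('cat','g');
  re.test('catastrophe');
}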

[3.3.3] ECMAScript 5 requires a regular expression literal to create a new RegExp instance every time it is evaluated, just like calling the RegExp constructor directly.
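
The practical effect can be sketched as follows, assuming an ES5 or later engine. The function names are invented for this example. A fresh instance starts with lastIndex at 0, while a single shared global instance remembers where the previous search ended.

```javascript
// ES5+: the literal is re-evaluated on each iteration, giving a fresh RegExp
// whose lastIndex starts at 0, so every test() succeeds.
function countFreshHits(times) {
  var hits = 0;
  for (var i = 0; i < times; i++) {
    var re = /cat/g;
    if (re.test('catastrophe')) {
      hits++;
    }
  }
  return hits;
}

// One shared global instance: lastIndex carries over, so results alternate.
function sharedResults(times) {
  var re = /cat/g;
  var results = [];
  for (var i = 0; i < times; i++) {
    results.push(re.test('catastrophe'));
  }
  return results;
}

console.log(countFreshHits(4)); // 4
console.log(sharedResults(4));  // [ true, false, true, false ]
```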

[4] Syntax

[Important] Do not add extra spaces inside a regular expression; every space is part of the pattern and is matched literally
[4.0] Metacharacters (14): () [] {} \ ^ $ | ? * + .
[Note] To match a metacharacter literally, escape it with \. In a pattern string passed to new RegExp, the backslash itself must be doubled
[4.1] Escape characters
[4.1.0] . (dot) represents any character other than the line break \n
[4.1.1] \d digit, \D non-digit
[4.1.2] \w letter, digit, or underscore; \W any other character
[Note] Chinese characters do not belong to \w
[4.1.3] \s whitespace, \S non-whitespace
[4.1.4] \b word boundary: a position where one side is \w and the other side is not. \B non-boundary.
[4.1.5] \1 is a backreference: it matches the same text that the first capture group matched
[tips] Find the character with the longest consecutive run, and the length of that run

var str = 'aaaaabbbbbdddddaaaaaaaffffffffffffffffffgggggcccccce';
var pattern = /(\w)\1+/g;
var maxLength = 0;
var maxValue = '';
//replace() is used only to visit every match; its return value is not needed
var result = str.replace(pattern,function(match,match1,pos,originalText){
  if(match.length > maxLength){
    maxLength = match.length;
    maxValue = match1;
  }
})
console.log(maxLength,maxValue);//18 "f"

[4.1.6] (\w)(\d)\1\2: \1 stands for the text that (\w) matched, and \2 stands for the text that (\d) matched
[Note] A subexpression must be enclosed in parentheses to be referenced, and groups are numbered in the order in which their opening parentheses appear.
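
A short sketch of how backreferences and group numbering behave. The sample strings and variable names are illustrative only.

```javascript
// \1 must repeat exactly what group 1 captured, not merely any \w character.
var doubledLetter = /(\w)\1/;
console.log(doubledLetter.test('book')); // true  ('oo')
console.log(doubledLetter.test('bok'));  // false

// (\w)(\d)\1\2 matches a character-digit pair that is repeated verbatim.
var repeatedPair = /(\w)(\d)\1\2/;
console.log(repeatedPair.test('a1a1')); // true
console.log(repeatedPair.test('a1b2')); // false

// Groups are numbered by their opening parenthesis, left to right:
// in ((a)(b)) group 1 is 'ab', group 2 is 'a', group 3 is 'b'.
var nested = /((a)(b))/.exec('ab');
console.log(nested[1], nested[2], nested[3]); // ab a b
```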
[4.1.7] \t Tab character
[4.1.8] \v Vertical tab character
[4.1.9] \uxxxx The Unicode character specified by the four hexadecimal digits xxxx
[Note 1] [\u4e00-\u9fa5] matches Chinese characters
[Note 2] Inside the strings passed to alert() and console.log(), the following are system escape characters
[a] \r carriage return
[b] \n newline
[c] \t tab character
[d] \b backspace
[tips] A line break in alert() cannot be written as <br> or <br/>; use \n instead. The alert text is rendered by the system, not parsed as HTML by the browser
alert('\n\tHello')
[Note 3] Because the parameters of the RegExp constructor are strings, some characters need double escaping. Every escaped metacharacter must be double escaped, and so must every character class escape such as \d or \w.

//Literal pattern        Equivalent string
// /\[bc\]at/            "\\[bc\\]at"
// /\.at/                "\\.at"
// /name\/age/           "name\\/age"
// /\d.\d{1,2}/          "\\d.\\d{1,2}"
// /\w\\hello\\123/      "\\w\\\\hello\\\\123"
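
The table can be checked directly: a literal and its double-escaped string produce the same pattern source. The sketch below also shows what happens when the backslash is not doubled. The variable names are made up for the example.

```javascript
// A literal and the equivalent double-escaped constructor string.
var literal = /\[bc\]at/;
var built = new RegExp('\\[bc\\]at');
console.log(literal.source === built.source); // true

// With a single backslash, the string literal swallows it: '\d+' is just 'd+'.
var tooFewSlashes = new RegExp('\d+');
var escaped = new RegExp('\\d+');
console.log(tooFewSlashes.source);      // d+
console.log(tooFewSlashes.test('123')); // false
console.log(escaped.test('123'));       // true
```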

[4.2] Quantifiers
[4.2.1]{n}: Match n times
[4.2.2]{n,m}: Match at least n times, up to m times
[4.2.3]{n,}: Match at least n times
[4.2.4]?: Equivalent to {0,1}
[4.2.5]*: Equivalent to {0,}
[4.2.6]+: equivalent to {1,}
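
As a rough check of the quantifiers above, the sketch below tests each one against strings of zero to four a's. `samples` and `whichMatch` are hypothetical names, and the ^ and $ anchors are added so that the whole string must fit the quantifier.

```javascript
var samples = ['', 'a', 'aa', 'aaa', 'aaaa'];

// Hypothetical helper: return the sample strings that the pattern accepts.
function whichMatch(pattern) {
  return samples.filter(function (s) {
    return pattern.test(s);
  });
}

console.log(whichMatch(/^a{2}$/));   // [ 'aa' ]
console.log(whichMatch(/^a{2,3}$/)); // [ 'aa', 'aaa' ]
console.log(whichMatch(/^a{2,}$/));  // [ 'aa', 'aaa', 'aaaa' ]
console.log(whichMatch(/^a?$/));     // [ '', 'a' ]
console.log(whichMatch(/^a*$/));     // all five samples
console.log(whichMatch(/^a+$/));     // [ 'a', 'aa', 'aaa', 'aaaa' ]
```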
[4.3] Position symbols
[4.3.1] ^ Start anchor
[4.3.2] $ End anchor
[4.3.3] (?=...) Positive lookahead
[4.3.4] (?!...) Negative lookahead
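
A minimal sketch of the position symbols. The sample strings and variable names are chosen only for illustration. A lookahead checks what follows without consuming it, so the checked text is not part of the match.

```javascript
// ^ and $ pin the pattern to the start and end of the input.
var wholeNumber = /^\d+$/;
console.log(wholeNumber.test('2015'));    // true
console.log(wholeNumber.test('2015-07')); // false

// (?=...) positive lookahead: 'Java' only when 'Script' follows.
var positive = /Java(?=Script)/;
var hit = positive.exec('JavaScript');
console.log(hit[0]);                      // Java ('Script' is not consumed)
console.log(positive.test('Java Beans')); // false

// (?!...) negative lookahead: 'Java' only when 'Script' does NOT follow.
var negative = /Java(?!Script)/;
console.log(negative.test('JavaScript')); // false
console.log(negative.test('Java Beans')); // true

// With the m flag, ^ and $ also match at line breaks.
var lines = 'one\ntwo'.match(/^\w+$/gm);
console.log(lines);                       // [ 'one', 'two' ]
```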
[4.4] Control symbols ([]: candidate set, |: or, ^: not, -: range)
[4.4.1](red|blue|green) Find any specified option
[4.4.2][abc] Find any character between square brackets
[4.4.3][^abc] Find any character not between square brackets
[4.4.4][0-9] Find any number from 0 to 9
[4.4.5][a-z] Find any character from lowercase a to lowercase z
[4.4.6][A-Z] Find any character from capital A to capital Z
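[4.4.7] [A-z] Find any character from uppercase A to lowercase z (the range also covers the six characters [ \ ] ^ _ ` that sit between Z and a; use [A-Za-z] for letters only)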
[4.4.8][adgk] Find any character in a given set
[4.4.9][^adgk] Find any character outside the given set
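
A few of the control symbols in action, as a sketch with made-up sample values. It also shows why [A-z] is usually not what is wanted: the range follows character codes, so it includes the punctuation between Z and a.

```javascript
// [A-z] spans character codes 65 to 122, which includes [ \ ] ^ _ and `
var upperToLower = /^[A-z]$/;
var lettersOnly = /^[A-Za-z]$/;
console.log(upperToLower.test('_')); // true
console.log(lettersOnly.test('_'));  // false

// [^...] negates the set: strip everything that is not a digit.
var digitsOnly = 'a1b2c3'.replace(/[^0-9]/g, '');
console.log(digitsOnly);             // 123

// (red|blue|green) picks one of the listed options.
var color = /^(red|blue|green)$/;
console.log(color.test('blue'));     // true
console.log(color.test('bluish'));   // false
```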
[4.5] Dollar sign (special sequences in the replacement string of replace())

//$$   A literal $
//$&   The substring that matches the entire pattern (same value as RegExp.lastMatch)
//$`   The substring before the matched substring (same value as RegExp.leftContext)
//$'   The substring after the matched substring (same value as RegExp.rightContext)
//$n   The substring of the nth capture group, where n is 1-9; $1 is the first capture group. A number with no matching group, such as $0, is left as literal text
//$nn  The substring of the nn-th capture group, where nn is 01-99
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$0'))//$0,$0,$0,$0
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$1'))//ca,ba,sa,fa
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$2'))//t,t,t,t
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$3'))//$3,$3,$3,$3
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$$'))//$,$,$,$
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$&'))//cat,bat,sat,fat
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,'$`'))//,cat,,cat,bat,,cat,bat,sat,
console.log('cat,bat,sat,fat'.replace(/(.a)(t)/g,"$'"))//,bat,sat,fat,,sat,fat,,fat,

[5] Instance properties: these expose information about a regular expression, but they are of limited use because the same information is already visible in the pattern declaration
[5.1] global: Boolean value, indicating whether the g flag is set
[5.2] ignoreCase: Boolean value, indicating whether the i flag is set
[5.3] lastIndex: integer, the character position at which the next search starts, counting from 0
[5.4] multiline: Boolean value, indicating whether the m flag is set
[5.5] source: the string form of the pattern, returned as it would appear in a literal rather than as the string passed to the constructor

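var pattern = new RegExp('\\[bc\\]at','i');
console.log(pattern.global);//false
console.log(pattern.ignoreCase);//true
console.log(pattern.multiline);//false
console.log(pattern.lastIndex);//0
console.log(pattern.source);//'\[bc\]at'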

[6] Constructor properties (static properties): these legacy, non-standard properties apply to all regular expressions in scope and change with the most recent regular expression operation. Each one can be accessed in two ways, by a long property name or a short property name. Most short names are not valid ECMAScript identifiers, so they must be accessed with square bracket syntax.
[6.1] With these properties, more specific information can be extracted from the last operation performed by exec() or test()


//Long attribute name   Short attribute name   Description
//input                 $_                     The last string the pattern was applied to
//lastMatch             $&                     The last matched text
//lastParen             $+                     The last matched capture group
//leftContext           $`                     The text before lastMatch in the input string
//multiline             $*                     Boolean, indicating whether all expressions use multi-line mode
//rightContext          $'                     The text after lastMatch in the input string

[Note 1] Opera does not support the short attribute names
[Note 2] Opera does not support input, lastMatch, lastParen, or multiline
[Note 3] IE does not support multiline

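var text = 'this has been a short summer';
var pattern = /(.)hort/g;
if(pattern.test(text)){
  console.log(RegExp.input);//'this has been a short summer'
  console.log(RegExp.leftContext);//'this has been a '
  console.log(RegExp.rightContext);//' summer'
  console.log(RegExp.lastMatch);//'short'
  console.log(RegExp.lastParen);//'s'
  console.log(RegExp.multiline);//false (in the older engines described here; newer engines may report undefined)
  console.log(RegExp['$_']);//'this has been a short summer'
  console.log(RegExp['$`']);//'this has been a '
  console.log(RegExp["$'"]);//' summer'
  console.log(RegExp['$&']);//'short'
  console.log(RegExp['$+']);//'s'
  console.log(RegExp['$*']);//false (same caveat as RegExp.multiline)
}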

[6.2] There are also up to 9 constructor properties for storing capture groups

//RegExp.$1, RegExp.$2, RegExp.$3 ... through RegExp.$9 store the first through ninth matched capture groups. These properties are filled automatically when exec() or test() is called

var text = 'this has been a short summer';
var pattern = /(..)or(.)/g;
if(pattern.test(text)){
  console.log(RegExp.$1);//sh
  console.log(RegExp.$2);//t
}

[7] Instance methods:
[7.1] exec(): designed for capture groups. It accepts one parameter, the string to apply the pattern to, and returns an array with information about the first match, or null if there is no match. The returned array has two extra properties: index, the position of the match in the string, and input, the string the regular expression was applied to. The first item of the array is the text that matches the entire pattern, and the remaining items are the texts that match each capture group. If the pattern has no capture groups, the array contains only one item

var text = 'mom and dad and baby and others';
var pattern = /mom( and dad( and baby)?)?/gi;
var matches = pattern.exec(text);
console.log(pattern,matches);
//pattern.lastIndex:20
//matches[0]:'mom and dad and baby'
//matches[1]:' and dad and baby'
//matches[2]:' and baby'
//matches.index:0
//matches.input:'mom and dad and baby and others'

[Note 1] Even if the global flag (g) is set in the pattern, exec() returns only one match per call
[Note 2] Without the global flag, calling exec() repeatedly on the same string always returns the first match
[Note 3] With the global flag, each call to exec() continues the search from where the previous match ended
[Note 4] In IE8 and earlier, the JavaScript implementation handles lastIndex incorrectly: it changes on every call, even in non-global mode.

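var text = 'cat,bat,sat,fat';
var pattern1 = /.at/;
var matches = pattern1.exec(text);
console.log(pattern1,matches);
//pattern1.lastIndex:0
//matches[0]:'cat'
//matches.index:0
//matches.input:'cat,bat,sat,fat'

matches = pattern1.exec(text);
console.log(pattern1,matches);
//pattern1.lastIndex:0
//matches[0]:'cat'
//matches.index:0
//matches.input:'cat,bat,sat,fat'

var text = 'cat,bat,sat,fat';
var pattern2 = /.at/g;
var matches = pattern2.exec(text);
console.log(pattern2,matches);
//pattern2.lastIndex:3
//matches[0]:'cat'
//matches.index:0
//matches.input:'cat,bat,sat,fat'

matches = pattern2.exec(text);
console.log(pattern2,matches);
//pattern2.lastIndex:7
//matches[0]:'bat'
//matches.index:4
//matches.input:'cat,bat,sat,fat'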

[tips] Use the exec() method to find every match and its position

var string = 'j1h342jg24g234j 3g24j1';
var pattern = /\d/g;
var valueArray = [];//values
var indexArray = [];//positions
var temp = pattern.exec(string);
while(temp != null){
  valueArray.push(temp[0]);
  indexArray.push(temp.index);
  temp = pattern.exec(string);
}
console.log(valueArray,indexArray);

[7.2] test(): accepts a string parameter and returns true if the pattern matches it, otherwise false
[Note] Use it when you only need to know whether the target string matches a pattern, not what the matched text is. It often appears in if statements

var text = '000-00-0000';
var pattern = /\d{3}-\d{2}-\d{4}/;
if(pattern.test(text)){
  console.log('The pattern was matched');
}

[8] Pattern matching methods of strings
[8.1] match(): accepts one parameter, a regular expression or a string, and returns the matches in an array
[Note] With the global flag, the array returned by match() has no index or input properties.
[a] Without /g

var string = 'cat,bat,sat,fat';
var pattern = /.at/;
var matches = string.match(pattern);
console.log(matches,matches.index,matches.input);//['cat'] 0 'cat,bat,sat,fat'

[b] With /g

var string = 'cat,bat,sat,fat';
var pattern = /.at/g;
var matches = string.match(pattern);
console.log(matches,matches.index,matches.input);//['cat','bat','sat','fat'] undefined undefined

[c] String

var string = 'cat,bat,sat,fat';
var pattern = 'at';
var matches = string.match(pattern);
console.log(matches,matches.index,matches.input);//['at'] 1 'cat,bat,sat,fat'

[8.2] search(): accepts one parameter, a regular expression or a string, and returns the position of the first match in the string, or -1 if nothing is found. It is similar to indexOf, except that a starting position cannot be set
[a] Regular expression (adding /g makes no difference)

var string = 'cat,bat,sat,fat';
var pattern = /.at/;
var pos = string.search(pattern);
console.log(pos);//0

[b] String

var string = 'cat,bat,sat,fat';
var pattern = 'at';
var pos = string.search(pattern);
console.log(pos);//1

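[tips] Find the positions of all matches

function fnAllSearch(str,pattern){
  var pos = str.search(pattern);
  if(pos === -1){
    return [];//no match at all
  }
  var length = str.match(pattern)[0].length;
  var index = pos+length;//offset just past the current match
  var result = [];
  var last = index;//number of characters already cut from the original string
  result.push(pos);
  while(true){
    str = str.substring(index);
    pos = str.search(pattern);
    if(pos === -1){
      break;
    }
    length = str.match(pattern)[0].length;
    index = pos+length;
    result.push(last+pos);
    last += index;
  }
  return result;
}
console.log(fnAllSearch('cat23fbat246565sa3dftf44at',/\d+/));//[3,9,17,22]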
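
The same positions can also be collected with exec() and the g flag, which avoids cutting the string apart. This is an alternative sketch, not the approach used above, and `findAllPositions` is a hypothetical name.

```javascript
// Hypothetical alternative: collect the start index of every match with exec().
function findAllPositions(str, pattern) {
  // Copy the pattern with g forced on, so the caller's lastIndex is left untouched.
  var flags = 'g' + (pattern.ignoreCase ? 'i' : '') + (pattern.multiline ? 'm' : '');
  var re = new RegExp(pattern.source, flags);
  var positions = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    positions.push(m.index);
    // An empty match does not advance lastIndex; step forward to avoid looping forever.
    if (m[0] === '') {
      re.lastIndex++;
    }
  }
  return positions;
}

console.log(findAllPositions('cat23fbat246565sa3dftf44at', /\d+/)); // [ 3, 9, 17, 22 ]
```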

[8.3] replace(): accepts two parameters. The first is a regular expression or a string (the content to search for); the second is a string or a function (the replacement)
[a] String replacement

var string = 'cat,bat,sat,fat';
var result = string.replace('at','ond');
console.log(result);//'cond,bat,sat,fat'

[b] Regular expression without /g

var string = 'cat,bat,sat,fat';
var result = string.replace(/at/,'ond');
console.log(result);//'cond,bat,sat,fat'

[c] Regular expression with /g

var string = 'cat,bat,sat,fat';
var result = string.replace(/at/g,'ond');
console.log(result);//'cond,bond,sond,fond'

[d] Function replacement: when the pattern has no capture groups, 3 parameters are passed to the function: the text matching the pattern, the position of the match in the string, and the original string. When the regular expression defines capture groups, the parameters are the text matching the pattern, the first capture group match, the second capture group match ... the Nth capture group match, followed by the same last two parameters: the position of the match in the string and the original string. The function should return a string, which is used as the replacement.
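
The argument order can be seen in a small sketch. The sample strings and variable names are invented for illustration.

```javascript
// With two capture groups the callback receives:
// (match, group1, group2, position, originalString)
var seenArgs = [];
var swapped = 'John Smith'.replace(/(\w+)\s(\w+)/, function (match, first, last, pos, original) {
  seenArgs = [match, first, last, pos, original];
  return last + ', ' + first;
});
console.log(swapped);  // Smith, John
console.log(seenArgs); // [ 'John Smith', 'John', 'Smith', 0, 'John Smith' ]

// With no capture groups the callback receives: (match, position, originalString)
var tagged = 'a1b22'.replace(/\d+/g, function (match, pos) {
  return '<' + match + '@' + pos + '>';
});
console.log(tagged);   // a<1@1>b<22@3>
```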

[tips] Prevent cross-site scripting (XSS) attacks by escaping HTML special characters

function htmlEscape(text){
  return text.replace(/[<>"&]/g,function(match,pos,originalText){
    switch(match){
      case '<':
        return '&lt;';
      case '>':
        return '&gt;';
      case '&':
        return '&amp;';
      case '\"':
        return '&quot;';
    }
  });
}
console.log(htmlEscape('<p class=\"greeting\">Hello world!</p>'));
//&lt;p class=&quot;greeting&quot;&gt;Hello world!&lt;/p&gt;
console.log(htmlEscape('<p class="greeting">Hello world!</p>'));
//Same as above

[9] Inherited methods: toString() and toLocaleString() return the regular expression in literal form as a string, regardless of how the regular expression was created, while valueOf() returns the regular expression object itself
[9.1] toString()
[9.2] toLocaleString()
[9.3] valueOf()

var pattern = new RegExp('\\[bc\\]at','gi');
console.log(pattern.toString()); // '/\[bc\]at/gi'
console.log(pattern.toLocaleString()); // '/\[bc\]at/gi'
console.log(pattern.valueOf()); // /\[bc\]at/gi

[10] Limitations: the following features were not supported by ECMAScript regular expressions as of ECMAScript 5 (later editions added several of them, including the u flag in ES2015 and lookbehind, named capture groups, and the s flag in ES2018)
[10.1] The \A and \Z anchors for the start and end of the string (but ^ and $ are supported for the same purpose)
[10.2] Lookbehind (but lookahead is supported)
[10.3] Union and intersection classes
[10.4] Atomic groups
[10.5] Unicode support (except for single characters)
[10.6] Named capture groups (but numbered capture groups are supported)
[10.7] The s (single-line) and x (free-spacing) matching modes
[10.8] Conditional matching
[10.9] Regular expression comments
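
Whether a given engine accepts one of these features can be probed at run time, since the RegExp constructor throws a SyntaxError for a pattern or flag it cannot compile. The sketch below assumes nothing about the engine. `supportsPattern` is a hypothetical helper, and the first three results depend on the engine version, so no output is stated for them.

```javascript
// Hypothetical helper: true if this engine can compile the pattern source and flags.
function supportsPattern(source, flags) {
  try {
    new RegExp(source, flags || '');
    return true;
  } catch (e) {
    return false;
  }
}

console.log(supportsPattern('(?<year>\\d{4})')); // named capture group (engine-dependent)
console.log(supportsPattern('(?<=\\$)\\d+'));    // lookbehind (engine-dependent)
console.log(supportsPattern('.', 's'));          // s flag (engine-dependent)
console.log(supportsPattern('a+'));              // true
console.log(supportsPattern('('));               // false: malformed in every engine
```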

[11] Common Examples
[11.1] Two methods to find all numbers in a string
[11.1.1] With plain string operations

var str1 = 'j1h342jg24g234j 3g24j1';
var array = [];
var temp = '';
for(var i = 0; i < str1.length; i++){
  var value = parseInt(str1.charAt(i),10);//Number() would turn a space into 0, so spaces could not be excluded
  if(!isNaN(value)){
    temp += str1.charAt(i);
  }else{
    if(temp != ''){
      array.push(temp);
      temp = '';
    }
  }
}
if(temp != ''){
  array.push(temp);
  temp = '';
}
console.log(array);

[11.1.2] With a regular expression

var str1 = 'j1h342jg24g234j 3g24j1';
array = str1.match(/\d+/g);
console.log(array);

[11.2] Sensitive word filtering (an application of the replace callback function)

var string = 'FLG is a cult';
var pattern = /FLG|cult/g;
var result = string.replace(pattern,function($0){
  var s = '';
  //one asterisk for each character of the matched word
  for(var i = 0; i < $0.length; i++){
    s+= '*';
  }
  return s;
})
console.log(result);//'*** is a ****'

[11.3] Date formatting

var array = ['2015.7.28','2015-7-28','2015/7/28','2015.7-28','2015-7.28','2015/7---28'];
function formatDate(date){
  return date.replace(/(\d+)\D+(\d+)\D+(\d+)/,'$1'+'Year'+'$2'+'Month'+'$3'+'Day')
}
var result = [];
for(var i = 0 ; i < array.length; i++){
  result.push(formatDate(array[i]));
}
console.log(result);

[11.4] Get the text content of an HTML fragment by stripping the tags

var str = '<p>refds</p><p>fasdf</p>'
var pattern = /<[^<>]+>/g;
console.log(str.replace(pattern,''));//'refdsfasdf'

[11.5] A compatible replacement for trim(): remove leading and trailing whitespace

var string = ' my name is littlematch ';
//The g flag is required; without it only the leading whitespace is removed
console.log(string.replace(/^\s+|\s+$/g,''));
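
Wrapped as a reusable function, the fallback might look like the sketch below. `trimCompat` is a hypothetical name, and the sketch assumes the native trim() should be preferred when the engine provides it.

```javascript
// Hypothetical wrapper: use the native trim() when present, else the regex fallback.
function trimCompat(text) {
  if (typeof String.prototype.trim === 'function') {
    return text.trim();
  }
  return text.replace(/^\s+|\s+$/g, '');
}

console.log('[' + trimCompat(' my name is littlematch ') + ']'); // [my name is littlematch]

// For comparison: without g, replace() stops after the first (leading) match.
var onlyLeading = ' my name '.replace(/^\s+|\s+$/, '');
console.log('[' + onlyLeading + ']'); // [my name ]
```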

I hope the above description of regular expressions in javascript can be helpful to everyone.