SoFunction
Updated on 2025-04-06

A detailed explanation of grouping in JavaScript regular expressions

Grouping

The following regular expression can match kidkidkid:

/kidkidkid/

A more elegant way to write it is:

/(kid){3}/

A sub-expression wrapped in parentheses like this is called a group.
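To see why the parentheses matter, here is a minimal sketch (the variable names are only illustrative). Without the group, the quantifier {3} applies to the preceding character alone rather than to the whole word.

```javascript
// With the group, {3} repeats the whole sub-expression "kid".
const grouped = /^(kid){3}$/;
// Without the group, {3} repeats only the final "d".
const ungrouped = /^kid{3}$/;

const groupedMatchesTriple = grouped.test('kidkidkid');     // true
const ungroupedMatchesTriple = ungrouped.test('kidkidkid'); // false
const ungroupedMatchesDdd = ungrouped.test('kiddd');        // true

console.log(groupedMatchesTriple, ungroupedMatchesTriple, ungroupedMatchesDdd);
```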

Alternatives

A group can contain several alternative expressions separated by |:

var reg = /I love (him|her|it)/;

reg.test('I love him')  // true
reg.test('I love her')  // true
reg.test('I love it')   // true
reg.test('I love them') // false

Here | means "or".
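The group also limits how far | reaches. The sketch below, with anchors added purely for illustration, compares the same alternatives with and without the parentheses.

```javascript
// Inside the group, | only chooses between him, her and it.
const withGroup = /^I love (him|her|it)$/;
// Without the group, | splits the entire pattern into three alternatives:
// "^I love him", "her", or "it$".
const withoutGroup = /^I love him|her|it$/;

const groupFullSentence = withGroup.test('I love her'); // true
const groupBareWord = withGroup.test('her');            // false
const noGroupBareWord = withoutGroup.test('her');       // true: "her" matches on its own

console.log(groupFullSentence, groupBareWord, noGroupBareWord);
```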

Capture and reference

The strings matched (captured) by a regular expression are stored temporarily. The strings captured by groups are numbered starting from 1, so we can refer to them:

var reg = /(\d{4})-(\d{2})-(\d{2})/
var date = '2010-04-12'
reg.test(date)

RegExp.$1 // 2010
RegExp.$2 // 04
RegExp.$3 // 12

$1 refers to the first captured string, $2 is the second, and so on.
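The static RegExp.$1 to RegExp.$9 properties are generally regarded as legacy features, and they are overwritten by every successful match. As an alternative sketch, the same captures can be read from the array returned by exec; the names dateReg, year, month and day are illustrative only.

```javascript
const dateReg = /(\d{4})-(\d{2})-(\d{2})/;

// exec returns null when nothing matches; this input is known to match.
const result = dateReg.exec('2010-04-12');

// Index 0 holds the whole match, indexes 1..3 hold the captured groups.
const [whole, year, month, day] = result;

console.log(whole);            // 2010-04-12
console.log(year, month, day); // 2010 04 12
```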

Using captures with replace

Captured strings can be referenced directly in the replacement argument of the replace method. For example, suppose we want to change the date 12.21/2012 to 2012-12-21:

var reg = /(\d{2})\.(\d{2})\/(\d{4})/
var date = '12.21/2012'

date = date.replace(reg, '$3-$1-$2') // date = 2012-12-21
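The same conversion can be wrapped in a small helper. This is a sketch under the assumption that the input uses a literal dot and a slash as separators; toIsoDate is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: rewrites MM.DD/YYYY as YYYY-MM-DD.
function toIsoDate(text) {
  // \. matches a literal dot; an unescaped . would match any character.
  return text.replace(/(\d{2})\.(\d{2})\/(\d{4})/, '$3-$1-$2');
}

const converted = toIsoDate('12.21/2012');   // '2012-12-21'
const untouched = toIsoDate('no date here'); // unchanged, nothing matched

console.log(converted, untouched);
```

When the pattern does not match, replace returns the original string, so the helper is safe to call on arbitrary text.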

By the way, passing a callback function to replace can sometimes solve a problem gracefully.

Replacing each banned word with the same number of asterisks is a common feature. For example, if the text is kid is a doubi and both kid and doubi are banned words, the result should be *** is a *****. We can write this:

var reg = /(kid|doubi)/g
var str = 'kid is a doubi'

str = str.replace(reg, function(word){
  return word.replace(/./g, '*')
})
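A variation on the same idea, as a sketch: the word boundaries (\b) are an added assumption so that longer words containing a banned word are left alone, and censor is a hypothetical helper name.

```javascript
// \b keeps the pattern from matching inside longer words such as "kidney".
const banned = /\b(kid|doubi)\b/g;

// Hypothetical helper: masks each banned word with asterisks of equal length.
function censor(text) {
  // The callback receives the matched text as its first argument.
  return text.replace(banned, (word) => '*'.repeat(word.length));
}

const masked = censor('kid is a doubi');        // '*** is a *****'
const unchanged = censor('kidney is an organ'); // unchanged

console.log(masked, unchanged);
```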

Capture of nested groupings

If you encounter nested groups like /((kid) is (a (doubi)))/, in what order are they captured? Let's try:

var reg = /((kid) is (a (doubi)))/
var str = "kid is a doubi"

reg.test(str) // true

RegExp.$1 // kid is a doubi
RegExp.$2 // kid
RegExp.$3 // a doubi
RegExp.$4 // doubi

The rule is that groups are numbered in the order in which their opening parentheses appear.
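The numbering can also be inspected in one go through the array returned by exec, as in this small sketch (the variable names are illustrative):

```javascript
const nested = /((kid) is (a (doubi)))/;

// slice(1) drops the whole match and keeps only the captured groups,
// in the order of their opening parentheses.
const groups = nested.exec('kid is a doubi').slice(1);

console.log(groups); // [ 'kid is a doubi', 'kid', 'a doubi', 'doubi' ]
```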

Backreferences

Captured groups can also be referenced inside the regular expression itself. This is called a backreference:

var reg = /(\w{3}) is \1/

reg.test('kid is kid') // true
reg.test('dik is dik') // true
reg.test('kid is dik') // false
reg.test('dik is kid') // false

\1 refers to the string captured by the first group. In other words, that part of the expression is determined dynamically at match time.
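One common use of a backreference is spotting an accidentally repeated word. The pattern below is a sketch that assumes the two words are separated by a single space.

```javascript
// (\w+) captures a word; \1 requires the exact same text to appear again.
const doubled = /\b(\w+) \1\b/;

const hit = doubled.exec('this is is a test');
const repeatedText = hit[0]; // 'is is'
const repeatedWord = hit[1]; // 'is'

const cleanSentence = doubled.test('this is a test'); // false

console.log(repeatedText, repeatedWord, cleanSentence);
```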

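Note that if the number refers to a group that does not exist, it is not treated as a backreference. In a pattern without the u flag it is read as a legacy octal escape, so \6 matches the character with code 6: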

var reg = /(\w{3}) is \6/;

reg.test( 'kid is kid' ); // false
reg.test( 'kid is \6' );  // true
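This fallback applies only to patterns without the u flag. With the u flag, a numbered reference to a missing group is a syntax error. The sketch below checks both cases; compiles is a hypothetical helper, and the patterns are built with the RegExp constructor so that the error can be caught at run time.

```javascript
// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    return false;
  }
}

const legacyCompiles = compiles('(\\w{3}) is \\6', '');   // true
const unicodeCompiles = compiles('(\\w{3}) is \\6', 'u'); // false

// Without the u flag, \6 matches the character with code 6 (U+0006).
const matchesControlChar = new RegExp('(\\w{3}) is \\6').test('kid is \x06'); // true

console.log(legacyCompiles, unicodeCompiles, matchesControlChar);
```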

Types of grouping

There are four types of grouping:

Capturing - ()
Non-capturing - (?:)
Positive lookahead - (?=)
Negative lookahead - (?!)

The groups discussed so far are all capturing groups, and only this type temporarily stores the matched string.

Non-capturing grouping

Sometimes we only want to group without capturing. In that case we can use a non-capturing group, written by placing ?: immediately after the opening parenthesis:

var reg = /(?:\d{4})-(\d{2})-(\d{2})/
var date = '2012-12-21'
reg.test(date)

RegExp.$1 // 12
RegExp.$2 // 21

In this example, the (?:\d{4}) group does not capture anything, so $1 is the string captured by the first (\d{2}).
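The effect on numbering is easy to compare side by side through the arrays returned by exec. A minimal sketch with illustrative variable names:

```javascript
const input = '2012-12-21';

const capturing = /(\d{4})-(\d{2})-(\d{2})/.exec(input);
const nonCapturing = /(?:\d{4})-(\d{2})-(\d{2})/.exec(input);

// Both match the same text, but the second stores one group fewer.
console.log(capturing[0], nonCapturing[0]);         // 2012-12-21 2012-12-21
console.log(capturing.length, nonCapturing.length); // 4 3
console.log(capturing[1], nonCapturing[1]);         // 2012 12
```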

Positive and negative lookahead groups

It's like standing in place and looking ahead:

Positive lookahead - is this what lies ahead of you?
Negative lookahead - is this not what lies ahead of you?

Those names are a mouthful, so I like to call them positive and negative assertions. Here is a positive lookahead example:

var reg = /kid is a (?=doubi)/

reg.test('kid is a doubi') // true
reg.test('kid is a shabi') // false

What follows "kid is a "? If it is doubi, the match succeeds.

Negative lookahead is the opposite:

var reg = /kid is a (?!doubi)/

reg.test('kid is a doubi') // false
reg.test('kid is a shabi') // true

A lookahead group does not capture its value either. So how does it differ from a non-capturing group? See this example:

var reg, str = "kid is a doubi"

reg = /(kid is a (?:doubi))/
reg.test(str)
RegExp.$1 // kid is a doubi

reg = /(kid is a (?=doubi))/
reg.test(str)
RegExp.$1 // kid is a

As you can see, a string matched by a non-capturing group is still captured by an enclosing capturing group, but a string matched by a lookahead group is not. Lookahead groups come in handy when you need to check what follows without including it in the match.
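Because a lookahead only checks the text and does not consume it, the checked text is also absent from the overall match. The sketch below shows this first, then applies the same property in a widely used idiom for inserting thousands separators; addThousandsSeparators is a hypothetical helper name and it assumes the input is a string of digits.

```javascript
// The lookahead checks for "doubi" but does not consume it.
const match = /kid is a (?=doubi)/.exec('kid is a doubi');
const matchedText = match[0]; // 'kid is a ' (with the trailing space)

// Hypothetical helper: inserts a comma at every position that is followed
// by one or more complete groups of three digits and then no further digit.
function addThousandsSeparators(digits) {
  // \B avoids a comma at the very start; the lookaheads consume nothing,
  // so only the empty position is replaced.
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

const million = addThousandsSeparators('1234567'); // '1,234,567'
const small = addThousandsSeparators('123');       // '123'

console.log(JSON.stringify(matchedText), million, small);
```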

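Finally, note that JS historically did not support lookbehind groups. The (?<=) and (?<!) forms were only added in ES2018, so older engines reject them.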