Let's take a look at the methods JavaScript provides for working with regular expressions.
RegExp
RegExp is the constructor of regular expressions.
There are many ways to create regular expressions using constructors:
new RegExp('abc'); // /abc/
new RegExp('abc', 'gi'); // /abc/gi
new RegExp(/abc/gi); // /abc/gi
new RegExp(/abc/m, 'gi'); // /abc/gi
It accepts two parameters: the first parameter is a matching pattern, which can be a string or a regular expression; the second parameter is a modifier.
If the regular expression passed as the first parameter already carries modifiers and the second parameter is also given, the modifiers from the second parameter take precedence. This behavior was introduced in ES2015.
The constructor is generally used when a regular expression has to be built dynamically, and it tends to be slower than the literal syntax.
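As a rough illustration of the dynamic case, the sketch below builds a pattern from a string that is only known at runtime. The helper names escapeRegExp and buildKeywordPattern and the sample input are assumptions made up for this example; they are not built-in APIs.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern,
// so that user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a global, case-insensitive pattern from a runtime value.
function buildKeywordPattern(keyword) {
  return new RegExp(escapeRegExp(keyword), 'gi');
}

const pattern = buildKeywordPattern('a.c'); // /a\.c/gi
const found = 'A.C abc a.c'.match(pattern); // ["A.C", "a.c"]
```

Without the escaping step, the dot would match any character and "abc" would be found as well.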
Let's take a look at its instance properties:
- lastIndex property. It marks the position where the next match starts during a global match; it is what drives global matching.
- source property. It stores the body of the pattern, for example abc in /abc/gi.
- Modifier properties. Currently these are global, ignoreCase, multiline, sticky, dotAll, and unicode; each returns a boolean indicating whether the corresponding modifier is enabled.
- flags property. Returns all modifiers as a string.
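The properties listed above can be read directly from an instance. A minimal sketch, where the variable names reg and info are only illustrative:

```javascript
const reg = /abc/gi;

// Collect the instance properties described in the list above.
const info = {
  source: reg.source,         // "abc"
  flags: reg.flags,           // "gi"
  global: reg.global,         // true
  ignoreCase: reg.ignoreCase, // true
  multiline: reg.multiline,   // false
  lastIndex: reg.lastIndex,   // 0, because no match has run yet
};
```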
match
match is a String instance method.
It returns the result of matching the text against its argument, so the name match fits well.
It accepts a regular expression as its only parameter.
So why does passing a string also work?
'abc'.match('b'); // ["b", index: 1, input: "abc", groups: undefined]
This is because match implicitly calls new RegExp() to convert the argument into a RegExp instance.
The return value of the match method can be divided into three cases.
Match failed
There is not much to say here: it returns null.
Non-global matching
Returns an array.
The first item in the array is the matching result. If no argument is passed, the match result is an empty string.
'abc'.match(); // ["", index: 0, input: "abc", groups: undefined]
If the regular expression contains capture groups, the captured results follow in the array starting from the second item. A capture group that captures nothing yields undefined.
'@abc2018'.match(/@([a-z]+)([A-Z]+)?/); // ["@abc", "abc", undefined, index: 0, input: "@abc2018", groups: undefined]
The array has an index attribute indicating the starting position of the matching result in the text.
The array has an input property that displays the source text.
The array has a groups property, which stores the results of named capture groups (not the numbered ones), keyed by name.
'@abc2018'.match(/@(?<lowerCase>[a-z]+)(?<upperCase>[A-Z]+)?/); // ["@abc", "abc", undefined, index: 0, input: "@abc2018", groups: { lowerCase: "abc", upperCase: undefined }]
Global Match
Returns an array.
All matches are placed in the array in order. Because every match is returned, the other information, including capture groups and the index, input, and groups properties, is not included.
'abc&mno&xyz'.match(/[a-z]+/g); // ["abc", "mno", "xyz"]
replace
replace is a String instance method.
Its purpose is to replace the matched text with the given string and return the new text. The source text is not changed.
It accepts two parameters.
The first parameter can be a string or a regular expression, and its function is to match.
The difference is that a regular expression is more expressive and can match globally, whereas a string pattern only replaces the first occurrence.
'abc-xyz-abc'.replace('abc', 'biu'); // "biu-xyz-abc"
'abc-xyz-abc'.replace(/abc/, 'biu'); // "biu-xyz-abc"
'abc-xyz-abc'.replace(/abc/g, 'biu'); // "biu-xyz-biu"
The second parameter can be a string or a function, and its function is to replace.
The second parameter is a string
The replace method provides some special variables for the second parameter being a string, which can meet general needs.
$number represents the capture group at the corresponding position. Note that although it looks like a variable, it should not be written inside a template-string placeholder such as `${$1}biu`. The internal logic of replace parses the replacement string and substitutes the variable itself.
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, '$1biu'); // "@biu-xyz-$biu"
$& represents the matching result.
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, '{$&}');
// {@abc}-xyz-{$abc}
$` represents the text on the left of the matching result.
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, '{$`}');
// {}-xyz-{@abc-xyz-}
$' represents the text on the right of the matching result.
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, "{$'}"); // "{-xyz-$abc}-xyz-{}"
What if you want the literal symbols rather than the variable's value? Add a $ to escape it.
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, '$$1biu'); // "$1biu-xyz-$1biu"
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, '$biu'); // "$biu-xyz-$biu"
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, '$$biu'); // "$biu-xyz-$biu"
Where there is no ambiguity, one $ and two $ both produce a single $, because in the doubled form the extra $ acts as an escape character. Where there is ambiguity, the extra $ is required.
The second parameter is a function
String variables can only be referenced; they cannot be operated on. A function is far more expressive.
The return value of the function is the replacement text. If the function does not return a value, it returns undefined by default, so the replacement text is the string "undefined".
The first parameter of the function is the matching result.
'abc-xyz-abc'.replace(/abc/g, (match) => `{${match}}`); // "{abc}-xyz-{abc}"
'abc-xyz-abc'.replace(/abc/g, (match) => {}); // "undefined-xyz-undefined"
If there is a capture group, the function's post-order parameters correspond to the capture group one by one.
'@abc3-xyz-$abc5'.replace(/([^-]+)abc(\d+)/g, (match, $1, $2) => `{${$1}${match}${$2}}`);
// {@@abc33}-xyz-{$$abc55}
The second-to-last parameter is the position of the match in the text.
'@abc-xyz-$abc'.replace(/([^-]+)abc/g, (match, $1, index) => `{${match} is position ${index}}`);
// {@abc is position 0}-xyz-{$abc is position 9}
The last parameter is the source text, unless the pattern contains named capture groups, in which case a groups object follows it.
'abc-xyz'.replace(/abc/g, (match, index, string) => `{{${match}} belongs to {${string}}}`); // "{{abc} belongs to {abc-xyz}}-xyz"
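To show what a function can do that a replacement string cannot, here is a small sketch that transforms the captured text before inserting it. The helper name toCamelCase and the sample input are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: turn "border-top-width" into "borderTopWidth".
function toCamelCase(name) {
  // match is the whole hit, e.g. "-t"; letter is the first capture group, e.g. "t".
  return name.replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

const camel = toCamelCase('border-top-width'); // "borderTopWidth"
```

A replacement string such as '$1' could only copy the captured letter; upper-casing it requires the function form.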
One of the most common uses of the replace method is escaping HTML tags.
'<p>hello regex</p>'.replace(/</g, '&lt;').replace(/>/g, '&gt;'); // "&lt;p&gt;hello regex&lt;/p&gt;"
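A slightly fuller version of the same idea can handle several characters in one pass by combining a character class with a replacement function. The names HTML_ENTITIES and escapeHtml are assumptions for this sketch, and the entity table is a minimal one rather than a complete sanitizer.

```javascript
// Minimal lookup table of characters to escape (an assumption, not exhaustive).
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: replace each special character with its entity.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (char) => HTML_ENTITIES[char]);
}

const escaped = escapeHtml('<a href="x">Tom & Jerry</a>');
// "&lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;"
```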
search
search is a String instance method.
Its purpose is to find the index of the first match. It does less than match, so it is the lighter choice when only the position is needed.
It accepts a regular expression as its only parameter. Like match, if a non-RegExp value is passed in, it calls new RegExp() to convert it into a RegExp instance.
'abc-xyz-abc'.search(/xyz/); // 4
'abc-xyz-abc'.search(/xyz/g); // 4
'abc-xyz-abc'.search(/mno/); // -1
'abc-xyz-abc'.search(); // 0
'abc-xyz-abc'.search(/abc/); // 0
Since only the first match can be returned, the g modifier has no effect on it.
If the match fails, return -1.
split
split is a String instance method.
Its purpose is to cut the source text based on the incoming separator. It returns an array of cut units.
It accepts two parameters. The first parameter can be a string or a regular expression, which is a separator; the second parameter is optional, limiting the maximum length of the returned array.
'abc-def_mno+xyz'.split(); // ["abc-def_mno+xyz"]
'abc-def_mno+xyz'.split('-_+'); // ["abc-def_mno+xyz"]
'abc-def_mno+xyz'.split(''); // ["a", "b", "c", "-", "d", "e", "f", "_", "m", "n", "o", "+", "x", "y", "z"]
'abc-def_mno+xyz'.split(/[-_+]/); // ["abc", "def", "mno", "xyz"]
'abc-def_mno+xyz'.split(/[-_+]/g); // ["abc", "def", "mno", "xyz"]
'abc-def_mno+xyz'.split(/[-_+]/, 3); // ["abc", "def", "mno"]
'abc-def_mno+xyz'.split(/[-_+]/, 5); // ["abc", "def", "mno", "xyz"]
If the first parameter is an empty string, the text is cut into individual characters.
In addition, because the regular expression in split is only used to match the separator, the g modifier makes no difference.
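A regular-expression separator is useful when the separator itself varies. The sketch below, with a made-up input line, splits on commas surrounded by optional whitespace and then uses the second parameter to cap the result.

```javascript
const line = 'abc , def,mno ,xyz';

// The separator is a comma with any amount of whitespace on either side.
const fields = line.split(/\s*,\s*/);      // ["abc", "def", "mno", "xyz"]

// The second parameter limits how many items are returned.
const firstTwo = line.split(/\s*,\s*/, 2); // ["abc", "def"]
```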
exec
exec is the RegExp instance method.
It returns the result of matching the pattern against its argument, similar to the string method match.
/xyz/.exec('abc-xyz-abc'); // ["xyz", index: 4, input: "abc-xyz-abc", groups: undefined]
/mno/.exec('abc-xyz-abc'); // null
/xyz/.exec(); // null
One small difference is what happens when the argument is omitted: exec returns null, while match returns an array containing an empty string. The reason is easy to picture. With match there is a string but no pattern: fish but no net, so at worst nothing is caught. With exec there is a pattern but no string: a net but no fish, so there is nothing to hope for.
The biggest difference between them is the global matching scenario.
A global match means multiple matches. The RegExp instance has a lastIndex property. Each time a match succeeds, this property is updated to the position where the next match begins. exec implements global matching based on this property.
const reg = /abc/g;
reg.lastIndex; // 0
reg.exec('abc-xyz-abc'); // ["abc", index: 0, input: "abc-xyz-abc", groups: undefined]
reg.lastIndex; // 3
reg.exec('abc-xyz-abc'); // ["abc", index: 8, input: "abc-xyz-abc", groups: undefined]
reg.lastIndex; // 11
reg.exec('abc-xyz-abc'); // null
reg.lastIndex; // 0
reg.exec('abc-xyz-abc'); // ["abc", index: 0, input: "abc-xyz-abc", groups: undefined]
If there are multiple match results, you can get all match results by performing multiple times. Therefore, exec is generally used in loop statements.
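A minimal sketch of such a loop, using a made-up input string. Unlike a global match call, each exec result still carries its capture groups and index.

```javascript
const reg = /([a-z]+)(\d+)/g;
const text = 'abc1-xyz22-mno333';
const results = [];

let match;
// exec returns null once no further match is found, which ends the loop
// and resets reg.lastIndex to 0.
while ((match = reg.exec(text)) !== null) {
  results.push({ word: match[1], number: match[2], index: match.index });
}
// results:
// [ { word: "abc", number: "1",   index: 0 },
//   { word: "xyz", number: "22",  index: 5 },
//   { word: "mno", number: "333", index: 11 } ]
```

The loop only terminates because the pattern has the g modifier and the same reg instance is reused on every iteration.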
There are two points that need special attention:
Because lastIndex keeps being updated and is reset to 0 once a match fails, this matching process can be repeated indefinitely.
The lastIndex property belongs to a regular instance. Only the lastIndex of the same instance will be constantly updated.
What does the second point mean in practice?
/abc/g.exec('abc-xyz-abc'); // ["abc", index: 0, input: "abc-xyz-abc", groups: undefined]
/abc/g.exec('abc-xyz-abc'); // ["abc", index: 0, input: "abc-xyz-abc", groups: undefined]
/abc/g.exec('abc-xyz-abc'); // ["abc", index: 0, input: "abc-xyz-abc", groups: undefined]
// ...
If you do not store the regular expression in a variable and reuse that reference, exec keeps spinning in place: each call creates a new RegExp instance, and its lastIndex starts from 0 every time.
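The contrast between a shared instance and a fresh literal can be sketched as follows; the variable names are illustrative only.

```javascript
const text = 'abc-xyz-abc';

// One shared instance: lastIndex carries over from call to call.
const shared = /abc/g;
const first = shared.exec(text).index;  // 0
const second = shared.exec(text).index; // 8, because shared.lastIndex was 3

// A fresh literal per call: every instance starts with lastIndex 0.
const freshFirst = /abc/g.exec(text).index;  // 0
const freshSecond = /abc/g.exec(text).index; // 0

// For the same reason, a loop written as
//   while (/abc/g.exec(text)) { ... }
// never finishes: the condition always finds the first match again.
```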
test
test is the RegExp instance method.
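Its purpose is to find out whether the source text contains a match, which makes it similar to the string method search. It is mostly used in form validation.
/abc/.test('abc-xyz-abc'); // true
/mno/.test('abc-xyz-abc'); // false
/abc/.test(); // false
The difference between test and search shows up mainly in two places. The first is the return value: search returns an index (or -1), while test returns a boolean. The second is lastIndex: with the g modifier, test advances lastIndex just as exec does, whereas search ignores it.
const reg = /abc/g;
reg.lastIndex; // 0
reg.test('abc-xyz-abc'); // true
reg.lastIndex; // 3
reg.test('abc-xyz-abc'); // true
reg.lastIndex; // 11
reg.test('abc-xyz-abc'); // false
reg.lastIndex; // 0
reg.test('abc-xyz-abc'); // true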
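This lastIndex behavior matters in validation code, where one pattern is often reused for many inputs. A small sketch with made-up inputs shows how a global pattern gives alternating answers for identical strings, while a non-global one does not.

```javascript
const inputs = ['abc', 'abc', 'abc'];

// With g, a successful test leaves lastIndex at 3, so the next call starts
// past the end of the next string, fails, and resets lastIndex to 0.
const globalReg = /abc/g;
const globalResults = inputs.map((input) => globalReg.test(input));
// [true, false, true]

// Without g, lastIndex is ignored and every call starts from position 0.
const plainReg = /abc/;
const plainResults = inputs.map((input) => plainReg.test(input));
// [true, true, true]
```

For a plain yes-or-no check, leaving out the g modifier avoids the problem.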
Changes to the underlying implementation of the string methods
As we have seen, some methods for working with regular expressions are defined on String instances and others on RegExp instances. To bring all of them together on the RegExp instance, ES2015 changed the underlying implementation of some string methods.
Specifically, ES2015 added four new methods to RegExp instances, and the string methods match, replace, search, and split now call the corresponding RegExp instance methods internally.
String.prototype.match calls RegExp.prototype[Symbol.match]
String.prototype.replace calls RegExp.prototype[Symbol.replace]
String.prototype.search calls RegExp.prototype[Symbol.search]
String.prototype.split calls RegExp.prototype[Symbol.split]
What is Symbol? It is a new primitive data type, and ES2015 defined 11 built-in Symbol values that point to methods used internally by the language.
RegExp.prototype[Symbol.match] is used just like match, with the caller and the argument swapped.
'abc-mno-xyz'.match(/mno/); // ["mno", index: 4, input: "abc-mno-xyz", groups: undefined]
/mno/[Symbol.match]('abc-mno-xyz'); // ["mno", index: 4, input: "abc-mno-xyz", groups: undefined]
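One way to observe this delegation is to pass match an object that is not a RegExp but defines its own Symbol.match method. The object vowelCounter below is hypothetical and exists only to make the call path visible.

```javascript
// Hypothetical matcher: not a RegExp, but it defines a Symbol.match method.
const vowelCounter = {
  [Symbol.match](text) {
    const vowels = text.match(/[aeiou]/g);
    return vowels === null ? 0 : vowels.length;
  },
};

// String.prototype.match finds the Symbol.match method and calls it with the string.
const count = 'regular expression'.match(vowelCounter); // 7

// The built-in method can also be called directly on a RegExp instance.
const direct = /mno/[Symbol.match]('abc-mno-xyz');
// direct[0] is "mno" and direct.index is 4
```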