When ES6 encounters strings and regular expressions

Strings are an important data type, and regular expressions give programmers more power to manipulate them. ES6 adds many new features to both strings and regular expressions. A comprehensive summary follows.

The article has two parts, strings and regular expressions, and takes about 10 minutes to read.

String

1. Better Unicode support

Unicode is a character set that aims to include every character in the world in one collection. As long as a computer supports this character set, it can display all of those characters, and garbled text becomes a thing of the past.

Before ES6, JavaScript strings were built on 16-bit character encoding: every 16-bit sequence was a code unit representing one character. Unicode later introduced an extended character set, and 16 bits were no longer enough to contain every character, so the encoding rules had to change.

In UTF-16, a code point can be represented by more than one code unit.

The first 2^16 code points in UTF-16 are each represented by a single 16-bit code unit; this range is called the Basic Multilingual Plane (BMP). Beyond it, surrogate pairs are introduced: two 16-bit code units together represent one code point, that is, a 32-bit supplementary plane character. A character represented by a 32-bit surrogate pair is a single character, but its length property is 2.
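
The surrogate-pair behavior described above can be sketched as follows. This is a minimal illustration using the character 𠮷 (U+20BB7), which lies outside the BMP; the variable names are chosen only for this example.

```javascript
// "𠮷" (U+20BB7) lies outside the BMP, so UTF-16 stores it as a surrogate pair.
const supplementary = "\u{20BB7}";
const bmp = "a";

// length counts 16-bit code units, not characters.
const supplementaryLength = supplementary.length; // 2
const bmpLength = bmp.length;                     // 1

// The two code units are the high and low surrogates.
const high = supplementary.charCodeAt(0).toString(16); // "d842"
const low = supplementary.charCodeAt(1).toString(16);  // "dfb7"

console.log(supplementaryLength, bmpLength, high, low);
```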

If you want to know more, you can refer to Ruan Yifeng's blog post: /blog/2014/1… The "code point" mentioned in that post is the same concept as the code point discussed here.

1.1 codePointAt() method

Before ES6, the charCodeAt() method returned the value of each 16-bit code unit of a character. ES6 adds the codePointAt() method. codePointAt(0) returns the code point that starts at position 0, even when it spans multiple code units (that is, when it is greater than 0xFFFF, the hexadecimal upper limit of a single code unit), whereas charCodeAt(0) returns only the first code unit at position 0.

Therefore, this method can be used to determine the number of code units a character occupies:

function is32Bit(c) {
  // A code point above 0xFFFF needs two 16-bit code units
  return c.codePointAt(0) > 0xFFFF;
}
console.log(is32Bit("𠮷")); // true
console.log(is32Bit("a"));  // false
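
To make the difference between the two methods concrete, here is a small sketch that calls both on the same string. The string and variable names are illustrative assumptions, not part of the original example.

```javascript
const text = "\u{20BB7}a"; // "𠮷a": one supplementary character plus one BMP character

// charCodeAt() always returns a single 16-bit code unit.
const unit0 = text.charCodeAt(0);   // 55362 (0xD842, the high surrogate)

// codePointAt() returns the full code point when the index is at a high surrogate.
const point0 = text.codePointAt(0); // 134071 (0x20BB7)

// At index 1 it lands on the low surrogate and returns just that code unit.
const point1 = text.codePointAt(1); // 57271 (0xDFB7)

// Index 2 is the ordinary character "a".
const point2 = text.codePointAt(2); // 97

console.log(unit0, point0, point1, point2);
```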

1.2 String.fromCodePoint() method

The codePointAt() method retrieves the code point of a character in a string; conversely, the String.fromCodePoint() method generates a character from the specified code point.

console.log(String.fromCodePoint(134071)); // "𠮷"

1.3 normalize() method

When comparing or sorting strings, different code point sequences may turn out to be equivalent. There are two kinds of equivalence:

Canonical equivalence: the two code point sequences are interchangeable in every respect.
Compatibility: the two code point sequences look different but can be used interchangeably in certain situations. They are still not strictly equal unless the equivalence is made explicit through normalization.

The normalize() method converts a string to a Unicode normalization form. It accepts an optional string parameter naming one of the four forms:

"NFC" (the default): decompose by canonical equivalence, then recompose by canonical equivalence
"NFD": decompose by canonical equivalence
"NFKD": decompose by compatibility
"NFKC": decompose by compatibility, then recompose by canonical equivalence
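
As a rough illustration of the two kinds of equivalence, the sketch below compares two spellings of é and normalizes the ﬁ ligature. The sample characters are assumptions picked for this example.

```javascript
// Canonical equivalence: "é" as one code point vs. "e" plus a combining accent.
const composed = "\u00e9";
const decomposed = "e\u0301";

const rawEqual = composed === decomposed; // false: the code points differ
const nfcEqual =
  composed.normalize("NFC") === decomposed.normalize("NFC"); // true
const nfdLength = composed.normalize("NFD").length; // 2: split into e + accent

// Compatibility: the "ﬁ" ligature (U+FB01) is only compatible with "fi".
const ligature = "\ufb01";
const nfc = ligature.normalize("NFC");   // still the ligature
const nfkc = ligature.normalize("NFKC"); // "fi"

console.log(rawEqual, nfcEqual, nfdLength, nfc === ligature, nfkc);
```

Normalizing both sides to the same form before comparing or sorting is the usual way to make equivalent strings compare as equal.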

1.4 Regular expression u modifier

Adding the u modifier to a regular expression switches it from code unit mode to character mode, in which a surrogate pair is no longer treated as two characters.
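
A minimal sketch of the mode switch, assuming the same supplementary character 𠮷 (U+20BB7) used earlier: the pattern /^.$/ asks for exactly one character.

```javascript
const text = "\u{20BB7}"; // "𠮷", stored as two code units

// Without u, "." matches one code unit, so the two-unit string does not match.
const withoutU = /^.$/.test(text); // false

// With u, "." matches one whole code point.
const withU = /^.$/u.test(text);   // true

console.log(withoutU, withU);
```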

However, the length property still returns the number of code units in a string, not the number of code points. This can also be solved with a regular expression that uses the u modifier.

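function codePointLength(text) {
  // With the u modifier, [\s\S] matches one whole code point at a time
  let result = text.match(/[\s\S]/gu);
  return result ? result.length : 0;
}
console.log(codePointLength("𠮷abc")); // 4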

Check whether the u modifier is supported

Using the u modifier in a JavaScript engine that is not ES6-compatible causes a syntax error. You can use the following function to detect whether it is supported.

function hasRegExpU() {
  try {
    // The constructor throws if the engine does not know the u modifier
    var pattern = new RegExp(".", "u");
    return true;
  } catch (ex) {
    return false;
  }
}

2. Other string changes

2.1 Identifying substrings

Developers have long used the indexOf() method to detect a substring within a string. ES6 provides three methods that achieve similar results:

The startsWith() method returns true if the specified text is found at the start of the string; otherwise it returns false.
The includes() method returns true if the specified text is found anywhere in the string; otherwise it returns false.
The endsWith() method, as the name suggests, checks the end of the string; its usage is consistent with the above.

All three methods accept two parameters. The first is the text to search for, a string. The second is optional and is a number: the index at which to start searching. For endsWith(), the second parameter is instead the position treated as the end of the string; when it is not specified, matching starts from the end of the string. A demonstration follows:

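let mes = "hello world!";
console.log(mes.startsWith("hello"));  // true
console.log(mes.endsWith("!"));        // true
console.log(mes.includes("o"));        // true
console.log(mes.startsWith("o"));      // false
console.log(mes.endsWith("d!"));       // true
console.log(mes.includes("x"));        // false
console.log(mes.startsWith("o", 4));   // true
console.log(mes.endsWith("o", 8));     // true
console.log(mes.includes("o", 8));     // false
// endsWith("o", 8) starts matching at index 7, the second o:
// position minus the length of the search text = 8 - 1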

2.2 repeat() method

ES6 adds a repeat() method to strings. It accepts a number and returns a new string consisting of the original string repeated that many times.

console.log("x".repeat(3)); // "xxx"
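
One place repeat() tends to be handy is building indentation or separator lines. The following is only a sketch; the names indent, level2, and line are made up for illustration.

```javascript
// Build an indentation unit of four spaces.
const indent = " ".repeat(4);

// Repeating the unit gives deeper indentation levels.
const level2 = indent.repeat(2);

// A simple separator line.
const line = "-".repeat(10);

console.log(level2 + "nested item");
console.log(line); // "----------"
```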


Regular expressions

1. Other regular expression changes

1.1 Regular expression y modifier

The y modifier makes a regular expression sticky: matching starts at the position given by the expression's lastIndex property. If there is no match at exactly that position, matching stops and the result is null.

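let text = 'hello1 hello2 hello3';
let patt = /hello\d\s?/,
  result = patt.exec(text);
let gPatt = /hello\d\s?/g,
  gResult = gPatt.exec(text);
let yPatt = /hello\d\s?/y,
  yResult = yPatt.exec(text);
console.log(result[0]);   // "hello1 "
console.log(gResult[0]);  // "hello1 "
console.log(yResult[0]);  // "hello1 "
patt.lastIndex = 1;
gPatt.lastIndex = 1;
yPatt.lastIndex = 1;
result = patt.exec(text);
gResult = gPatt.exec(text);
yResult = yPatt.exec(text);
console.log(result[0]);   // "hello1 "
console.log(gResult[0]);  // "hello2 "
console.log(yResult[0]);  // throws an error: yResult is null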

Of the three regular expressions here, the first has no modifier, the second uses the global modifier g, and the third uses the y modifier.

The first round of matches starts at the h character. After lastIndex is set to 1, the expression without modifiers ignores the change and still returns "hello1 ". The g expression starts searching from the e character and moves forward until it finds "hello2 ". The y expression also starts at the e character, but the text at that exact position ("ello1 h...") does not match the pattern, so the result is null and accessing yResult[0] throws an error.

After a successful match with the y modifier, the index immediately following the last matched character is saved in lastIndex. If the match fails, lastIndex is reset to 0. The g modifier behaves the same way.

The lastIndex property comes into play when the exec() and test() methods of the regular expression object are called. Under the final ES6 specification, string methods such as match() delegate to the same matching logic, so a sticky pattern passed to match() is anchored at lastIndex as well.

You can use the sticky property to detect whether the y modifier is present. If the regular expression was created with the sticky modifier, the value of sticky is true; otherwise it is false.

let patt = /hello\d/y;
console.log(patt.sticky); // true
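
A common use of sticky matching is reading tokens one after another from an exact position. The sketch below is a hypothetical helper (tokenize is not from the original article) that splits simple arithmetic input into number and operator tokens, relying on lastIndex advancing after each successful match.

```javascript
// Hypothetical helper: split input such as "12 + 3*4" into tokens.
function tokenize(input) {
  // y anchors every match at lastIndex; surrounding whitespace is consumed.
  const token = /\s*(\d+|[+\-*\/])\s*/y;
  const tokens = [];
  while (token.lastIndex < input.length) {
    // Remember the position, because a failed match resets lastIndex to 0.
    const start = token.lastIndex;
    const match = token.exec(input);
    if (match === null) {
      throw new SyntaxError(`Unexpected input at index ${start}`);
    }
    tokens.push(match[1]);
  }
  return tokens;
}

console.log(tokenize("12 + 3*4")); // ["12", "+", "3", "*", "4"]
```

With the g modifier instead of y, the pattern would silently skip over unexpected characters; the sticky modifier is what lets the loop detect them.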

1.2 Copying regular expressions

In ES5, you can copy a regular expression by passing it to the RegExp constructor. However, when the first argument is a regular expression, supplying a second argument throws an error. ES6 changes this behavior: the second argument can specify modifiers, which replace those of the original.

let re1 = /ab/i;
let re2 = new RegExp(re1, "g");
console.log(re1.toString()); // "/ab/i"
console.log(re2.toString()); // "/ab/g"

1.3 flags property

The new flags property added by ES6 returns, as a string, all modifiers applied to the current regular expression.

let re = /ab/g;
console.log(re.source); // "ab"
console.log(re.flags);  // "g"
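
Together, source and flags make it possible to derive a new regular expression from an existing one. The helper below, withSticky, is a hypothetical name used only for this sketch; it adds the y modifier while keeping the modifiers already present.

```javascript
const re = /ab/gi;

// Hypothetical helper: return a copy of the pattern with the y modifier added.
function withSticky(pattern) {
  const flags = pattern.flags.includes("y")
    ? pattern.flags
    : pattern.flags + "y";
  return new RegExp(pattern.source, flags);
}

const stickyCopy = withSticky(re);

console.log(re.flags);          // "gi"
console.log(stickyCopy.flags);  // "giy"
console.log(stickyCopy.sticky); // true
```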

2. Template literals

2.1 Basic syntax

In one sentence: backticks (`) are used in place of double and single quotes.

To use a backtick inside the string, escape it with a backslash (\). For example:

let message = `\`hello\`!`;
console.log(message);

The result is `hello`!

2.2 Simplified multi-line string

Before ES6, multi-line strings were created with arrays or string concatenation. In ES6, you simply break the line directly in the code, and the line break counts toward the length property. All whitespace inside the backticks, including indentation, is also part of the string.

let message = `Multiline
string`;
console.log(message);
console.log(message.length); // 16 = 9 + 6 + 1

2.3 String placeholders

In a template literal, you can embed any valid JavaScript expression in a placeholder, and its value is output as part of the resulting string.

A placeholder is delimited by ${ and }, with any JavaScript expression in between. A template literal is itself a JavaScript expression, so one template literal can be embedded in another.

let name = "sarah";
let message = `my ${`name is ${name}.`}`;
console.log(message); // "my name is sarah."

Here message is a template literal whose placeholder contains another template literal, `name is ${name}.`.
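
Placeholders are not limited to variables. As a small sketch (the count and price values are invented for illustration), a placeholder can hold a calculation and a method call:

```javascript
const count = 10;
const price = 0.25;

// The expression inside ${ } is evaluated and converted to a string.
// The first "$" is a literal dollar sign because it is not followed by "{".
const message = `${count} items cost $${(count * price).toFixed(2)}.`;

console.log(message); // "10 items cost $2.50."
```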

Summary

That concludes this introduction to what happens when ES6 meets strings and regular expressions. I hope it is helpful. If you have any questions, please leave a message and I will reply as soon as I can.