gsub() can delete, add to, replace, and trim parts of strings. It works on a single string or on a character vector of strings.
The basic usage is: gsub("target", "replacement", x)
Every operation in gsub works by replacing each match of "target" with "replacement". To delete, set the replacement to the empty string ""; to add content, set the replacement to the target followed by the new content; replacing and trimming work the same way.
> text <- "AbcdEfgh . Ijkl MNM"
> gsub("Efg", "AAA", text) # Change Efg to AAA (case sensitive)
[1] "AbcdAAAh . Ijkl MNM"
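Deletion and addition follow the same pattern as the replacement above. The following is a minimal sketch using the same text; the variable names deleted, extended and relaxed are only illustrative, and ignore.case = TRUE is shown as one way to relax the default case sensitivity.

```r
text <- "AbcdEfgh . Ijkl MNM"

# Deletion: an empty replacement removes every match.
deleted <- gsub("Efg", "", text)
print(deleted)   # "Abcdh . Ijkl MNM"

# Addition: repeat the target in the replacement, then append the new content.
extended <- gsub("Efg", "Efg_new", text)
print(extended)  # "AbcdEfg_newh . Ijkl MNM"

# Matching is case sensitive by default; ignore.case = TRUE relaxes that.
relaxed <- gsub("efg", "AAA", text, ignore.case = TRUE)
print(relaxed)   # "AbcdAAAh . Ijkl MNM"
```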
Any character, including spaces, tabs and line breaks, can be matched.
> gsub(" I", "i", text) # Spaces are matched too
[1] "AbcdEfgh .ijkl MNM"
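Tabs and line breaks can be handled the same way as spaces. Here is a small sketch, assuming a made-up string messy that contains a tab, a newline and a double space:

```r
messy <- "one\ttwo\nthree  four"

# "\t" and "\n" inside an R string are a literal tab and newline,
# so they can be used directly as the target.
no_tab <- gsub("\t", " ", messy)
no_newline <- gsub("\n", " ", no_tab)

# Collapse the remaining double space into a single space.
clean <- gsub("  ", " ", no_newline)
print(clean)  # "one two three four"
```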
Every occurrence of the target is replaced, not just the first one:
> gsub("M", "N", text)
[1] "AbcdEfgh . Ijkl NNN"
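The same call also works element by element on a character vector. The sketch below uses a hypothetical vector labels, and contrasts gsub with sub, which replaces only the first match in each string:

```r
labels <- c("MNM", "Mum", "abc", NA)

# gsub is vectorised: each element is processed independently,
# and missing values stay missing.
all_replaced <- gsub("M", "N", labels)
print(all_replaced)    # "NNN" "Num" "abc" NA

# sub() replaces only the first match in each element.
first_replaced <- sub("M", "N", labels)
print(first_replaced)  # "NNM" "Num" "abc" NA
```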
In addition, the target can be a regular expression, which allows more general replacements:
> gsub("^.* ", "a", text) # Everything from the start through the last space is replaced with a
[1] "aMNM"
> gsub("^.* I(j).*$", "\\1", text) # Keep only the captured j
[1] "j"
> gsub(" .*$", "b", text) # Everything from the first space to the end is replaced with b
[1] "AbcdEfghb"
> gsub("\\.", "\\+", text) # The period . and the plus sign + are special in patterns and must be escaped with \\ to be matched literally
[1] "AbcdEfgh + Ijkl MNM"
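Two points from these examples are worth a closer look: captured groups can be reused in the replacement, and escaping can be avoided by matching the target literally. The following is a sketch using the same text; fixed = TRUE is a standard gsub argument, while the variable names are only illustrative.

```r
text <- "AbcdEfgh . Ijkl MNM"

# Each pair of parentheses captures a group; \\1 and \\2 refer to them
# in the replacement. Here the last two words are swapped.
swapped <- gsub("(Ijkl) (MNM)", "\\2 \\1", text)
print(swapped)  # "AbcdEfgh . MNM Ijkl"

# Without escaping, "." matches any character, so every character is replaced.
unescaped <- gsub(".", "+", text)

# fixed = TRUE treats the target as a literal string, so no escaping is needed.
literal <- gsub(".", "+", text, fixed = TRUE)
print(literal)  # "AbcdEfgh + Ijkl MNM"
```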
Syntax | Description |
\\d | Digit, 0,1,2 ... 9 |
\\D | Not Digit |
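\\s | Whitespace character (space, tab, newline, ...) |
\\S | Not a whitespace character |
\\w | Word character (letter, digit or underscore) |
\\W | Not a word character |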
\\t | Tab |
\\n | New line |
^ | Beginning of the string |
$ | End of the string |
\ | Escape special characters, e.g. \\ is "\", \+ is "+" |
| | Alternation, e.g. (e|d)n matches "en" and "dn" |
. | Any single character (with perl = TRUE, any character except \n) |
[ab] | a or b |
[^ab] | Any character except a and b |
[0-9] | All Digit |
[A-Z] | All uppercase A to Z letters |
[a-z] | All lowercase a to z letters |
[A-Za-z] | All uppercase and lowercase letters ([A-z] also matches some punctuation in ASCII and is best avoided) |
i+ | i at least one time |
i* | i zero or more times |
i? | i zero or 1 time |
i{n} | i occurs n times in sequence |
i{n1,n2} | i occurs n1 - n2 times in sequence |
i{n1,n2}? | Non-greedy match: i occurs n1 to n2 times, as few as possible |
i{n,} | i occurs >= n times |
[:alnum:] | Alphanumeric characters: [:alpha:] and [:digit:] (POSIX classes are written inside brackets, e.g. [[:alnum:]]) |
[:alpha:] | Alphabetic characters: [:lower:] and [:upper:] |
[:blank:] | Blank characters, i.e. space and tab |
[:cntrl:] | Control characters |
[:digit:] | Digits: 0 1 2 3 4 5 6 7 8 9 |
[:graph:] | Graphical characters: [:alnum:] and [:punct:] |
[:lower:] | Lower-case letters in the current locale |
[:print:] | Printable characters: [:alnum:], [:punct:] and space |
[:punct:] | Punctuation character: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~ |
[:space:] | Space characters: tab, newline, vertical tab, form feed, carriage return, space |
[:upper:] | Upper-case letters in the current locale |
[:xdigit:] | Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f |
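A few rows of the table can be tried out directly with gsub. The sketch below uses made-up strings (sample_text, tagged, runs) to illustrate a digit class, a POSIX class, alternation, a counted quantifier, and the difference between greedy and non-greedy matching; it assumes R's default regular expression engine.

```r
sample_text <- "Order 66: 12 apples, 7 pears!"

# \\d+ matches runs of one or more digits.
digits_masked <- gsub("\\d+", "#", sample_text)
print(digits_masked)  # "Order #: # apples, # pears!"

# POSIX classes go inside a bracket expression: [[:punct:]], not [:punct:].
no_punct <- gsub("[[:punct:]]", "", sample_text)
print(no_punct)       # "Order 66 12 apples 7 pears"

# Alternation: (apple|pear)s matches "apples" and "pears".
fruit <- gsub("(apple|pear)s", "fruit", sample_text)
print(fruit)          # "Order 66: 12 fruit, 7 fruit!"

# Counted quantifier: a{2,3} matches two or three consecutive a's.
runs <- "a aa aaa aaaa"
counted <- gsub("a{2,3}", "-", runs)
print(counted)        # "a - - -a"

# Greedy vs non-greedy: .+ takes as much as possible, .+? as little as possible.
tagged <- "<a><b>"
greedy <- gsub("<.+>", "X", tagged)
lazy <- gsub("<.+?>", "X", tagged)
print(greedy)         # "X"
print(lazy)           # "XX"
```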