In everyday programming, strings constantly need to be matched, searched, replaced, and validated, and the rules involved can get complicated. Writing that logic by hand, character by character, costs a lot of time and effort. Regular expressions are the main tool for avoiding it.
1: What is a regular expression
1. Definition: A regular expression is a specification for pattern matching and replacement. It is a text pattern made up of ordinary characters (such as the letters a to z) and special characters (metacharacters), and it describes one or more strings to look for when searching a body of text. The regular expression acts as a template that is matched against the string being searched.
2. Purpose:
String matching (character matching)
String search
String replacement
String splitting
For example:
Extract email addresses from a web page
Check whether an IP address is well formed
Extract links from a web page
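The four purposes above can be sketched in a few lines of Java. This is only an illustration: the class name RegexPurposes, the sample text, and both patterns are assumptions made for this example, and the patterns are deliberately loose (the IP pattern checks the shape of the address, not the 0-255 range).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexPurposes {
    // Illustrative, deliberately loose patterns (assumptions, not full validators).
    static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");
    static final Pattern IPV4_SHAPE = Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}");

    // Searching: collect every substring that looks like an email address.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Matching: the whole string must have the dotted-quad shape.
    static boolean looksLikeIpv4(String s) {
        return IPV4_SHAPE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String page = "Contact alice@example.com or bob@example.org for details.";
        System.out.println(findEmails(page));
        System.out.println(looksLikeIpv4("192.168.0.1"));
        System.out.println(looksLikeIpv4("192.168."));
        // Replacement: mask every digit.
        System.out.println("a2389a".replaceAll("\\d", "*"));
        // Splitting: break on commas with optional surrounding whitespace.
        System.out.println(Arrays.toString("a, b ,c".split("\\s*,\\s*")));
    }
}
```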
3. Classes that handle regular expressions:
java.util.regex.Pattern class: the pattern that a string is matched against. The pattern is compiled ahead of time, which makes repeated matching much more efficient.
java.util.regex.Matcher class: the result of matching a pattern against a particular string; there may be many results.
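As a minimal sketch of how the two classes work together, the example below compiles a pattern once and reuses it for several inputs. The class name CompiledPatternDemo and the helper isThreeLetters are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompiledPatternDemo {
    // Compiled once, then reused for every input.
    private static final Pattern THREE_LETTERS = Pattern.compile("[a-z]{3}");

    static boolean isThreeLetters(String s) {
        // Each call creates a fresh Matcher bound to this particular input.
        Matcher m = THREE_LETTERS.matcher(s);
        return m.matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc", "abcd", "ab1"}) {
            System.out.println(s + " -> " + isThreeLetters(s));
        }
    }
}
```

A Pattern can be shared freely, while a Matcher carries the state of one matching operation and is best created per input.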
4: Let’s briefly introduce regular expressions through a small program
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // matches() checks whether the whole string matches the expression; "." means any character
        p("abc".matches("..."));
        // Replace every digit in "a2389a" with *; \d matches a digit 0-9
        p("a2389a".replaceAll("\\d", "*"));
        // Compile a pattern for exactly three letters a-z; compiling once speeds up repeated matching
        Pattern p = Pattern.compile("[a-z]{3}");
        // Run the match; the result is held in a Matcher object
        Matcher m = p.matcher("abc");
        p(m.matches());
        // The three lines above can be replaced by this single line
        p("abc".matches("[a-z]{3}"));
    }

    // Shorthand for printing a value
    public static void p(Object o) {
        System.out.println(o);
    }
}
The program prints:
true
a****a
true
true
Now we will run some small experiments to explain the matching rules of regular expressions. The quantifiers below are the greedy ones.
.        any character
a?       a, once or not at all
a*       a, zero or more times
a+       a, one or more times
a{n}     a, exactly n times
a{n,}    a, at least n times
a{n,m}   a, at least n times but not more than m times
// A first look at . * + ?
p("a".matches("."));//true
p("aa".matches("aa"));//true
p("aaaa".matches("a*"));//true
p("aaaa".matches("a+"));//true
p("".matches("a*"));//true
p("aaaa".matches("a?"));//false
p("".matches("a?"));//true
p("a".matches("a?"));//true
p("1232435463685899".matches("\\d{3,100}"));//true
p("192.168.".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}"));//false
p("192".matches("[0-2][0-9][0-9]"));//true
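"Greedy" means a quantifier takes as many characters as it is allowed to. A small sketch makes this visible by listing every match that find() produces. The class GreedyDemo and the helper allMatches are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Return every non-overlapping match of regex in input, left to right.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // a{1,3} grabs three a's first, leaving one for the second match.
        System.out.println(allMatches("a{1,3}", "aaaa"));     // [aaa, a]
        // \d{1,3} splits seven digits into groups of at most three.
        System.out.println(allMatches("\\d{1,3}", "1234567")); // [123, 456, 7]
    }
}
```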
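[abc]            a, b, or c (simple class)
[^abc]           any character except a, b, or c (negation)
[a-zA-Z]         a through z or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)
// Character classes and ranges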
p("a".matches("[abc]"));//true
p("a".matches("[^abc]"));//false
p("A".matches("[a-zA-Z]"));//true
p("A".matches("[a-z]|[A-Z]"));//true
p("A".matches("[a-z[A-Z]]"));//true
p("R".matches("[A-Z&&[RFG]]"));//true
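To see what union, intersection, and subtraction actually select, the sketch below filters a string one character at a time through a character class. The class CharClassDemo and the helper keep are hypothetical names for this illustration.

```java
public class CharClassDemo {
    // Keep only the characters of input that match the single-character class.
    static String keep(String charClass, String input) {
        StringBuilder sb = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (String.valueOf(c).matches(charClass)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(keep("[a-d[m-p]]", "abcdefmnopqz")); // union: abcdmnop
        System.out.println(keep("[a-z&&[def]]", "abcdefg"));    // intersection: def
        System.out.println(keep("[a-z&&[^bc]]", "abcd"));       // subtraction: ad
        System.out.println(keep("[a-z&&[^m-p]]", "lmnopq"));    // subtraction: lq
    }
}
```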
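\d    a digit: [0-9]
\D    a non-digit: [^0-9]
\s    a whitespace character: [ \t\n\x0B\f\r]
\S    a non-whitespace character: [^\s]
\w    a word character: [a-zA-Z_0-9]
\W    a non-word character: [^\w]
// Getting to know \s \w \d \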
p("\n\r\t".matches("\\s{4}"));//false: only three whitespace characters
p(" ".matches("\\S"));//false
p("a_8 ".matches("\\w{3}"));//false: the trailing space is not a word character
p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));//true
p("\\".matches("\\\\"));//true
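Note that inside a Java string literal every regex backslash must be doubled, so the regex \d is written "\\d" and a literal backslash is written "\\\\". The sketch below labels each character of a string by the predefined class it falls into. The class PredefinedClassDemo and the labels D, W, S are assumptions made for this illustration.

```java
public class PredefinedClassDemo {
    // Label each character: D = digit, W = other word character,
    // S = whitespace, '.' = anything else.
    static String classify(String input) {
        StringBuilder sb = new StringBuilder();
        for (char c : input.toCharArray()) {
            String s = String.valueOf(c);
            if (s.matches("\\d")) {
                sb.append('D');       // digits are checked first, since \w also covers them
            } else if (s.matches("\\w")) {
                sb.append('W');
            } else if (s.matches("\\s")) {
                sb.append('S');
            } else {
                sb.append('.');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(classify("a_8 \t&")); // WWDSS.
    }
}
```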
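Boundary matchers
^     the beginning of a line
$     the end of a line
\b    a word boundary
\B    a non-word boundary
\A    the beginning of the input
\G    the end of the previous match
\Z    the end of the input but for the final terminator, if any
\z    the end of the input
// Boundary matching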
p("hello sir".matches("^h.*"));//true
p("hello sir".matches(".*ir$"));//true
p("hello sir".matches("^h[a-z]{1,3}o\\b.*"));//true
p("hellosir".matches("^h[a-z]{1,3}o\\b.*"));//false
// Blank line: starts with zero or more whitespace characters other than a newline and ends with a newline
p(" \n".matches("^[\\s&&[^\\n]]*\\n$"));//true
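A common use of \b is searching for a whole word rather than any substring. The following is a minimal sketch; the class BoundaryDemo and the helper containsWord are hypothetical names for this illustration, and Pattern.quote is used so that the word is treated literally.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // True if word occurs in text as a whole word (delimited by word boundaries).
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("hello sir", "sir"));  // true
        System.out.println(containsWord("hellosir", "sir"));   // false: no boundary before "sir"
        System.out.println(containsWord("hello sir", "hell")); // false: "hell" is followed by "o"
    }
}
```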
Method analysis
matches(): matches the pattern against the entire string
find(): finds the next substring that matches the pattern
lookingAt(): always matches from the beginning of the string, but does not need to cover the whole string
//email
p("asdsfdfagf@".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+"));//true
// matches() find() lookingAt()
Pattern p = Pattern.compile("\\d{3,5}");
Matcher m = p.matcher("123-34345-234-00");
// matches() tries to match the whole input and fails at the first "-",
// but it does not give back the characters it has already consumed
p(m.matches());
// reset() gives the consumed characters back
m.reset();
// 1: With only p(m.matches()), find() starts searching from "...34345-234-00":
// the first two calls find "34345" and "234"; the last two find nothing and return false
// 2: With both p(m.matches()) and m.reset(), find() starts from "123-34345-234-00":
// the results are true, true, true, false
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
// If find() returned false, start() and end() throw an IllegalStateException
//p(m.start() + "---" + m.end());
// lookingAt() always starts at the beginning, so every call returns true
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
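The same comparison can be written so that no call depends on the state left behind by another, by giving each method its own Matcher. This is a sketch only; the class MatchMethodsDemo and the helper spans are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchMethodsDemo {
    static final Pattern DIGITS = Pattern.compile("\\d{3,5}");

    // Collect "start---end" for every match that find() locates.
    static List<String> spans(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            out.add(m.start() + "---" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "123-34345-234-00";
        // Whole input must match: false, because of the "-" characters.
        System.out.println(DIGITS.matcher(input).matches());
        // Only a prefix must match: true, "123" is enough.
        System.out.println(DIGITS.matcher(input).lookingAt());
        // Every matching substring: "00" is too short to count.
        System.out.println(spans(input)); // [0---3, 4---9, 10---13]
    }
}
```

Using a fresh Matcher per call trades a little allocation for code whose result does not depend on call order.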
String replacement: the approach below gives fine-grained control over each replacement
// String replacement
// Pattern.CASE_INSENSITIVE makes matching case-insensitive
Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("java Java jAva ILoveJavA youHateJAVA adsdsfd");
// Buffer that accumulates the result
StringBuffer buf = new StringBuffer();
// Counts the matches so odd and even ones can be replaced differently
int i = 0;
while (m.find()) {
    i++;
    if (i % 2 == 0) {
        m.appendReplacement(buf, "java");
    } else {
        m.appendReplacement(buf, "JAVA");
    }
}
// Without this call, the trailing text "adsdsfd" would be dropped
m.appendTail(buf);
p(buf);
Printed result:
JAVA java JAVA ILovejava youHateJAVA adsdsfd
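The find() / appendReplacement() / appendTail() loop can be wrapped in a small reusable helper. The sketch below is one possible shape, not a standard API: the class ReplaceEachDemo and the method replaceEach are hypothetical. It passes each replacement through Matcher.quoteReplacement so that "$" and "\" in the replacement text are taken literally instead of being read as group references.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceEachDemo {
    // Replace every match with the result of applying rewrite to the matched text.
    static String replaceEach(Pattern pattern, String text, Function<String, String> rewrite) {
        Matcher m = pattern.matcher(text);
        StringBuffer buf = new StringBuffer();
        while (m.find()) {
            String replacement = rewrite.apply(m.group());
            // quoteReplacement keeps "$" and "\" literal in the replacement.
            m.appendReplacement(buf, Matcher.quoteReplacement(replacement));
        }
        // Copy whatever follows the last match.
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        Pattern java = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
        System.out.println(replaceEach(java, "java Java jAva", String::toUpperCase)); // JAVA JAVA JAVA
        Pattern digits = Pattern.compile("\\d+");
        System.out.println(replaceEach(digits, "cost 5 or 12", s -> "$" + s)); // cost $5 or $12
    }
}
```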
Grouping
// Groups are written with ()
Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
String s = "123aa-34345bb-234cc-00";
Matcher m = p.matcher(s);
p(m.groupCount());// 2 groups
while (m.find()) {
    p(m.group());// the whole match: digits and letters
    //p(m.group(1));// digits only
    //p(m.group(2));// letters only
}
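To show what each group holds, here is a self-contained sketch that prints the whole match together with group 1 and group 2. The class GroupDemo and the helper describe are hypothetical names for this illustration; the pattern and input are the ones used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Group 1: three to five digits. Group 2: two lowercase letters.
    static final Pattern P = Pattern.compile("(\\d{3,5})([a-z]{2})");

    // Describe every match: group() is the whole match, group(n) is the n-th parenthesis.
    static List<String> describe(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = P.matcher(s);
        while (m.find()) {
            out.add(m.group() + " -> digits=" + m.group(1) + ", letters=" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describe("123aa-34345bb-234cc-00")) {
            System.out.println(line);
        }
    }
}
```

Groups are numbered by the position of their opening parenthesis, counting from 1; group 0 is the whole match.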
2. Simple use of regular expressions
Java regular expression usage