Updated on 2025-03-09

Basic knowledge of Java regular expressions

In program development you will inevitably need to match, search, replace, and validate strings. These tasks can be complicated, and solving them with hand-written code alone often wastes a programmer's time and energy. Learning and using regular expressions is the main way to solve this problem.


1: What is a regular expression

    1. Definition: A regular expression is a notation for pattern matching and replacement. It is a text pattern composed of ordinary characters (such as the letters a to z) and special characters (metacharacters), and it describes one or more strings to look for when searching a body of text. The regular expression acts as a template that is matched against the string being searched.

   2. Purpose:

String matching (character matching)

String search

String replacement

String segmentation

For example:

Extract email addresses from a web page

Check whether an IP address is valid

Extract links from a web page
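The four purposes listed above can be sketched with standard java.lang.String and java.util.regex calls. The class name RegexPurposes, its helper methods, and the sample strings are made up for illustration, and the patterns are deliberately simple rather than complete validators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample data are illustrative only.
class RegexPurposes {
    // Matching: four dot-separated groups of 1-3 digits.
    // This checks the shape only, not that each number is at most 255.
    static boolean looksLikeIp(String s) {
        return s.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    // Searching: collect every substring that looks like a simple email address.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+").matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Replacement: hide every digit behind an asterisk.
    static String maskDigits(String s) {
        return s.replaceAll("\\d", "*");
    }

    // Splitting: cut on commas, ignoring spaces around them.
    static List<String> splitOnCommas(String s) {
        return Arrays.asList(s.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIp("192.168.0.1"));
        System.out.println(findEmails("Contact alice@example.com or bob@example.org."));
        System.out.println(maskDigits("a2389a"));
        System.out.println(splitOnCommas("a, b ,c"));
    }
}
```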

   3. Classes that handle regular expressions:

java.util.regex.Pattern class: the compiled form of the pattern that a string is matched against. Because the pattern is compiled in advance, repeated matching is much more efficient.

java.util.regex.Matcher class: holds the result of matching a pattern against a particular string; there may be many results.
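As a minimal sketch of how the two classes divide the work, the example below compiles a pattern once and creates a new matcher for each input. The class name CompiledPatternDemo, the constant, and the countMatches helper are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing one compiled Pattern reused for many inputs.
class CompiledPatternDemo {
    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern THREE_LOWERCASE = Pattern.compile("[a-z]{3}");

    // Counts the inputs that consist of exactly three lowercase letters.
    static int countMatches(String[] inputs) {
        int count = 0;
        for (String input : inputs) {
            // Each input gets its own Matcher, which holds the per-match state.
            if (THREE_LOWERCASE.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] inputs = {"abc", "abcd", "xyz", "a1c"};
        System.out.println(countMatches(inputs)); // prints 2
    }
}
```

Calling "abc".matches("[a-z]{3}") gives the same answer for a single string, but it compiles the pattern again on every call.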

   4: Let’s briefly introduce regular expressions through a small program

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
 public static void main(String[] args) {
  // matches() tests whether the whole string matches the expression; "." stands for any single character
  p("abc".matches("..."));
  // Replace every digit in "a2389a" with *; \d stands for a digit 0-9
  p("a2389a".replaceAll("\\d", "*"));
  // Compile the pattern "three letters from a to z" once; a compiled pattern speeds up repeated matching
  Pattern p = Pattern.compile("[a-z]{3}");
  // Run the match; the Matcher object holds the result
  Matcher m = p.matcher("abc");
  p(m.matches());
  // The three lines above can be replaced by this single line
  p("abc".matches("[a-z]{3}"));
 }
 // Shorthand for System.out.println
 public static void p(Object o){
  System.out.println(o);
 }
} 

Below are the print results

true
a****a
true
true

The following experiments illustrate the matching rules of regular expressions. The quantifiers listed here are the greedy ones.

.         Any character
a?        a, once or not at all
a*        a, zero or more times
a+        a, one or more times
a{n}      a, exactly n times
a{n,}     a, at least n times
a{n,m}    a, at least n times, but not more than m times

//Getting to know . * + ?
        p("a".matches("."));//true
        p("aa".matches("aa"));//true
        p("aaaa".matches("a*"));//true
        p("aaaa".matches("a+"));//true
        p("".matches("a*"));//true
        p("aaaa".matches("a?"));//false
        p("".matches("a?"));//true
        p("a".matches("a?"));//true
        p("1232435463685899".matches("\\d{3,100}"));//true
        p("192.168.".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}"));//false
        p("192".matches("[0-2][0-9][0-9]"));//true
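The matches() calls above only say whether a whole string fits. To see what "greedy" means, the sketch below uses find() to return the first substring a quantifier actually consumes: a greedy quantifier takes as many characters as it is allowed to. The class GreedyDemo and the helper firstMatch are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; firstMatch is an illustrative helper, not a library method.
class GreedyDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // a{2,3} may take two or three a's; being greedy, it takes three.
        System.out.println(firstMatch("a{2,3}", "aaaa"));     // prints aaa
        // a* takes every leading a it can reach.
        System.out.println(firstMatch("a*", "aaab"));         // prints aaa
        // \d{3,5} stops at its upper bound of five digits.
        System.out.println(firstMatch("\\d{3,5}", "1234567")); // prints 12345
    }
}
```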

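[abc]            a, b, or c (simple class)
[^abc]           Any character except a, b, or c (negation)
[a-zA-Z]         a through z or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)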

//Character classes and ranges

        p("a".matches("[abc]"));//true
        p("a".matches("[^abc]"));//false
        p("A".matches("[a-zA-Z]"));//true
        p("A".matches("[a-z]|[A-Z]"));//true
        p("A".matches("[a-z[A-Z]]"));//true
        p("R".matches("[A-Z&&[RFG]]"));//true
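The union and intersection forms are easier to read with a concrete use. The sketch below wraps two of them in small helpers; the class CharClassDemo and both method names are assumptions introduced for this example.

```java
// Hypothetical demo class for character-class union and subtraction.
class CharClassDemo {
    // Subtraction: lowercase letters with the vowels removed.
    static boolean isLowercaseConsonant(char c) {
        return String.valueOf(c).matches("[a-z&&[^aeiou]]");
    }

    // Union: a through d, or m through p.
    static boolean inUnion(char c) {
        return String.valueOf(c).matches("[a-d[m-p]]");
    }

    public static void main(String[] args) {
        System.out.println(isLowercaseConsonant('b')); // prints true
        System.out.println(isLowercaseConsonant('e')); // prints false
        System.out.println(inUnion('n'));              // prints true
        System.out.println(inUnion('e'));              // prints false
    }
}
```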

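\d    A digit: [0-9]
\D    A non-digit: [^0-9]
\s    A whitespace character: [ \t\n\x0B\f\r]
\S    A non-whitespace character: [^\s]
\w    A word character: [a-zA-Z_0-9]
\W    A non-word character: [^\w]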

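//Getting to know \s \w \d \
        p("\n\r\t".matches("\\s{4}"));//false: only three whitespace characters
        p(" ".matches("\\S"));//false
        p("a_8 ".matches("\\w{3}"));//false: the trailing space is not a word character
        p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));//true
        p("\\".matches("\\\\"));//true: one literal backslash is written \\\\ in a Java regex string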
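The predefined classes are most often used for splitting and simple validation, and the backslash case is a frequent source of mistakes, so a short sketch may help. The class PredefinedClassDemo and its helper names are hypothetical.

```java
// Hypothetical demo class for \s, \w, and backslash escaping.
class PredefinedClassDemo {
    // Splits on runs of any whitespace (spaces, tabs, newlines).
    static String[] splitWords(String text) {
        return text.trim().split("\\s+");
    }

    // True when the string is one or more word characters: letters, digits, underscore.
    static boolean isWordOnly(String s) {
        return s.matches("\\w+");
    }

    // The regex for one literal backslash is \\, which is written "\\\\" in Java source.
    static String toForwardSlashes(String path) {
        return path.replaceAll("\\\\", "/");
    }

    public static void main(String[] args) {
        System.out.println(splitWords("a b\tc\nd").length);    // prints 4
        System.out.println(isWordOnly("id_42"));               // prints true
        System.out.println(toForwardSlashes("C:\\temp\\file")); // prints C:/temp/file
    }
}
```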

Boundary matcher

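^     The beginning of a line
$     The end of a line
\b    A word boundary
\B    A non-word boundary
\A    The beginning of the input
\G    The end of the previous match
\Z    The end of the input but for the final terminator, if any
\z    The end of the input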

//Boundary matching
        p("hello sir".matches("^h.*"));//true
        p("hello sir".matches(".*ir$"));//true
        p("hello sir".matches("^h[a-z]{1,3}o\\b.*"));//true
        p("hellosir".matches("^h[a-z]{1,3}o\\b.*"));//false: there is no word boundary after "o"
//Blank line: starts with zero or more whitespace characters other than newline, and ends with a newline
        p(" \n".matches("^[\\s&&[^\\n]]*\\n$"));//true
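A common use of \b is a whole-word search, which the sketch below pairs with the blank-line test from the snippet above. The class BoundaryDemo and its method names are assumptions for this example; Pattern.quote is the standard call for treating the search word as literal text.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for boundary matchers.
class BoundaryDemo {
    // True when word occurs in text as a whole word, not as part of a longer word.
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    // True when the line holds only non-newline whitespace followed by a newline.
    static boolean isBlankLine(String line) {
        return line.matches("[\\s&&[^\\n]]*\\n");
    }

    public static void main(String[] args) {
        System.out.println(containsWord("hello sir", "sir"));     // prints true
        System.out.println(containsWord("hello sirloin", "sir")); // prints false
        System.out.println(isBlankLine(" \t\n"));                 // prints true
        System.out.println(isBlankLine("a\n"));                   // prints false
    }
}
```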

Method analysis

matches(): tries to match the entire string

find(): looks for the next matching substring

lookingAt(): always starts from the beginning of the string, but does not need to match all of it

//email
        p("asdsfdfagf@".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+"));//true for a complete address; the part after "@" is missing from this sample
       
        //matches() find() lookingAt()
        Pattern p = Pattern.compile("\\d{3,5}");
        Matcher m = p.matcher("123-34345-234-00");
       
// matches() tries to match the whole string and stops at the first "-", which does not match.
// It does not give back the characters it has already consumed.
        p(m.matches());
// reset() gives them back by returning the matcher to the start of the string
        m.reset();
       
// Case 1: with only p(m.matches()); the substring search starts from "...34345-234-00",
// so the first two find() calls locate "34345" and "234", and the next two return false.
// Case 2: with both p(m.matches()); and m.reset(); the search starts from "123-34345-234-00",
// so the results are true, true, true, false.
        p(m.find());
        p(m.start()+"---"+m.end());
        p(m.find());
        p(m.start()+"---"+m.end());
        p(m.find());
        p(m.start()+"---"+m.end());
        p(m.find());
// After a find() that returns false, start() and end() throw an IllegalStateException
        //p(m.start()+"---"+m.end());
       
// lookingAt() starts from the beginning of the string on every call
        p(m.lookingAt());
        p(m.lookingAt());
        p(m.lookingAt());
        p(m.lookingAt());
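To compare the three methods without one call affecting the next, the sketch below gives each call a fresh Matcher over the same sample string. The class MatchMethodsDemo and its helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; each helper creates its own Matcher, so no state is shared.
class MatchMethodsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d{3,5}");

    // matches(): the whole string must be three to five digits.
    static boolean wholeMatch(String s) {
        return DIGITS.matcher(s).matches();
    }

    // lookingAt(): the string only has to start with three to five digits.
    static boolean prefixMatch(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // find(): collects every matching substring, left to right.
    static List<String> allMatches(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = DIGITS.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "123-34345-234-00";
        System.out.println(wholeMatch(s));  // prints false
        System.out.println(prefixMatch(s)); // prints true
        System.out.println(allMatches(s));  // prints [123, 34345, 234]
    }
}
```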

String replacement: the approach below gives you control over each individual replacement

//String replacement
//Pattern.CASE_INSENSITIVE makes the match case-insensitive
        Pattern p = Pattern.compile("java",Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher("java Java jAva ILoveJavA youHateJAVA adsdsfd");
//Buffer that collects the result
        StringBuffer  buf = new StringBuffer();
//Counter used to tell odd matches from even ones
        int i  = 0;
        while(m.find()){
            i++;
            if(i%2 == 0){
                m.appendReplacement(buf, "java");
            }else{
                m.appendReplacement(buf, "JAVA");
            }
        }
//Without this call, the trailing text "adsdsfd" would be dropped
        m.appendTail(buf);
        p(buf);

Printed result:

JAVA java JAVA ILovejava youHateJAVA adsdsfd
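The same appendReplacement/appendTail loop also works when the replacement has to be computed from each match. As a minimal sketch, the hypothetical helper doubleNumbers below doubles every number in a string; the class name and sample text are made up, and Matcher.quoteReplacement is used so that the computed text is inserted literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for computed replacements.
class ComputedReplacementDemo {
    // Replaces every run of digits with twice its numeric value.
    static String doubleNumbers(String s) {
        Matcher m = Pattern.compile("\\d+").matcher(s);
        StringBuffer buf = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // quoteReplacement keeps any $ or \ in the replacement from being interpreted.
            m.appendReplacement(buf, Matcher.quoteReplacement(String.valueOf(doubled)));
        }
        // Copy whatever follows the last match.
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears")); // prints 6 apples and 24 pears
    }
}
```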

Grouping

//Grouping: groups are marked with ()
        Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
        String s = "123aa-34345bb-234cc-00";
        Matcher m = p.matcher(s);
        p(m.groupCount());//2 groups
        while(m.find()){
            p(m.group());//the whole match: digits and letters
            //p(m.group(1));//digits only
            //p(m.group(2));//letters only
        }
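To show what the commented-out group(1) and group(2) calls return, the sketch below collects both groups for every match in the same sample string. The class GroupDemo, the helper pairs, and the "digits:letters" output format are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for capturing groups.
class GroupDemo {
    private static final Pattern PAIR = Pattern.compile("(\\d{3,5})([a-z]{2})");

    // For each match, joins group 1 (digits) and group 2 (letters) with a colon.
    static List<String> pairs(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = PAIR.matcher(s);
        while (m.find()) {
            result.add(m.group(1) + ":" + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        // groupCount() counts the parenthesized groups in the pattern; group 0 is not included.
        System.out.println(PAIR.matcher("").groupCount());   // prints 2
        System.out.println(pairs("123aa-34345bb-234cc-00")); // prints [123:aa, 34345:bb, 234:cc]
    }
}
```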

2. Simple use of regular expressions

Java regular expression usage