Regular expressions

Keywords: Java regex

The java.util.regex package provides regular expression support and contains two main classes, Pattern and Matcher. A Pattern object is the compiled form of a regular expression and defines the search pattern to use. A Matcher object is the engine that actually performs the search against an input string.

1 Static method for checking strings

To check whether a string contains the substring runoob, you can use the following code:

        String content = "I am noob " + "from runoob.com.";
        String pattern = ".*runoob.*";
        boolean isMatch = Pattern.matches(pattern, content);
        System.out.println("Does the string contain the substring 'runoob'? " + isMatch);
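Pattern.matches succeeds only when the pattern covers the whole input, which is why the pattern above is wrapped in .* on both sides. As an alternative sketch, Matcher.find() can look for the substring directly. The class name ContainsCheck and the helper containsRunoob are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original example.
class ContainsCheck {
    // Compiled once; the pattern is the bare substring, with no surrounding .*
    static final Pattern RUNOOB = Pattern.compile("runoob");

    // Returns true if "runoob" occurs anywhere in the input.
    static boolean containsRunoob(String content) {
        Matcher m = RUNOOB.matcher(content);
        return m.find();
    }

    public static void main(String[] args) {
        String content = "I am noob " + "from runoob.com.";
        System.out.println(containsRunoob(content));     // true
        System.out.println(containsRunoob("I am noob")); // false
    }
}
```

One practical difference: by default the dot does not match line terminators, so ".*runoob.*" fails on multi-line input, while the find() version does not depend on the dot at all.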

2 Pattern class

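Neither Pattern nor Matcher has a public constructor, so instances cannot be created directly with new. They can only be obtained through factory methods. A Pattern is created with the static method Pattern.compile(String regex), and its pattern() method returns the regular expression string it was compiled from.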

Pattern p=Pattern.compile("\\w+");
p.pattern();//Returns \w+

Pattern has a split(CharSequence input) method that splits the input around matches of the regular expression and returns a String[]. For example, the following code splits a string at each run of digits.

Pattern p=Pattern.compile("\\d+");
String[] str=p.split("My QQ is:456456 My phone number is:0532214 My email is:aaa@aaa.com");
//str is {"My QQ is:", " My phone number is:", " My email is:aaa@aaa.com"}
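To make the behaviour of split easier to see, here is a small sketch on shorter inputs. The class SplitDemo and the method splitOnDigits are illustrative names only. It also shows two edge cases: trailing empty strings are dropped, while a match at the very start of the input leaves an empty first element.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class for Pattern.split.
class SplitDemo {
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Splits the input around every run of digits.
    static String[] splitOnDigits(String input) {
        return DIGITS.split(input);
    }

    public static void main(String[] args) {
        // The trailing empty string after "333" is discarded.
        System.out.println(Arrays.toString(splitOnDigits("a1b22c333"))); // [a, b, c]
        // A match at index 0 produces an empty leading element.
        System.out.println(Arrays.toString(splitOnDigits("1a2")));       // [, a]
    }
}
```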

Pattern.matches(String regex,CharSequence input) is a static convenience method for a quick, one-off match. It returns true only if the regular expression matches the entire input.

Pattern.matches("\\d+","2223");//Returns true
Pattern.matches("\\d+","2223aa");//Returns false: the whole input must match, and aa cannot be matched by \d+
Pattern.matches("\\d+","22bb23");//Returns false: the whole input must match, and bb cannot be matched by \d+
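Pattern.matches compiles the regular expression on every call, so when the same check runs many times it is usually better to compile once and reuse the Pattern. A minimal sketch, assuming a hypothetical DigitValidator class that is not part of the original text:

```java
import java.util.regex.Pattern;

// Hypothetical validator that reuses one compiled Pattern.
class DigitValidator {
    // Compiled a single time when the class is loaded.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Same result as Pattern.matches("\\d+", input), without recompiling.
    static boolean isAllDigits(String input) {
        return DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2223"));   // true
        System.out.println(isAllDigits("2223aa")); // false
        System.out.println(isAllDigits("22bb23")); // false
    }
}
```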

3 Matcher Class

Matcher has no public constructor either. An instance can only be obtained by calling Pattern.matcher(CharSequence input) on a compiled Pattern. Pattern on its own supports only simple operations such as matches and split. For more flexible matching, such as working with capture groups, you need to use it together with Matcher.

Pattern p=Pattern.compile("\\d+"); 
Matcher m=p.matcher("22bb23"); 
m.pattern();//Returns p, the Pattern object that created this Matcher

The Matcher class provides three matching operations. Each returns a boolean: true if a match is found, false otherwise.
matches() attempts to match the entire input and returns true only if the whole string matches. Note that the static Pattern.matches method is implemented by calling Matcher's matches().

Pattern p=Pattern.compile("\\d+"); 
Matcher m=p.matcher("22bb23"); 
m.matches();//Returns false because bb cannot be matched by \d+, so the string as a whole does not match
Matcher m2=p.matcher("2223"); 
m2.matches();//Returns true because \d+ matches the entire string

lookingAt() attempts to match starting at the beginning of the input. It returns true only if a match begins at index 0, but unlike matches() it does not require the whole input to match.

Pattern p=Pattern.compile("\\d+");
Matcher m=p.matcher("22bb23");
m.lookingAt();//Returns true because \d+ matches the leading 22
Matcher m2=p.matcher("aa2223");
m2.lookingAt();//Returns false because the input starts with aa, which \d+ cannot match

find() scans the input for the next substring that matches the pattern. The match can be anywhere in the input.

Pattern p=Pattern.compile("\\d+"); 
Matcher m=p.matcher("22bb23"); 
m.find();//Return true 
Matcher m2=p.matcher("aa2223"); 
m2.find();//Return true 
Matcher m3=p.matcher("aa2223bb"); 
m3.find();//Return true 
Matcher m4=p.matcher("aabb"); 
m4.find();//Return false 
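The three operations are easiest to compare when run side by side on the same inputs. The sketch below does that; MatchModes and describe are hypothetical names chosen for this illustration. It creates a fresh Matcher for each call so that the search position left by one operation cannot influence the next.

```java
import java.util.regex.Pattern;

// Hypothetical helper comparing matches(), lookingAt() and find().
class MatchModes {
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        // One new Matcher per operation keeps the calls independent.
        boolean whole = p.matcher(input).matches();     // entire input
        boolean prefix = p.matcher(input).lookingAt();  // must start at index 0
        boolean anywhere = p.matcher(input).find();     // any position
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d+", "2223"));   // matches=true lookingAt=true find=true
        System.out.println(describe("\\d+", "22bb23")); // matches=false lookingAt=true find=true
        System.out.println(describe("\\d+", "aa2223")); // matches=false lookingAt=false find=true
        System.out.println(describe("\\d+", "aabb"));   // matches=false lookingAt=false find=false
    }
}
```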

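After a successful match, the following methods describe it. They throw IllegalStateException if no match has been attempted yet or the last attempt failed.
start() returns the index of the first character of the matched substring.
end() returns the index just after the last character of the matched substring.
group() returns the matched substring.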

Pattern p=Pattern.compile("\\d+"); 
Matcher m=p.matcher("aaa2223bb"); 
m.find();//Matches 2223
m.start();//Returns 3
m.end();//Returns 7, the index just after 2223
m.group();//Returns 2223
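Because each call to find() resumes after the previous match, calling it in a loop visits every match in the input. The following sketch collects each match together with its start() and end() positions. The class FindAll and the text@start-end output format are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that lists every match with its position.
class FindAll {
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // find() returns false once no further match exists.
        while (m.find()) {
            hits.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "aaa2223bb45")); // [2223@3-7, 45@9-11]
    }
}
```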

start(), end(), and group() each have an overload that takes a group number, for working with capture groups: start(int i), end(int i), and group(int i). The Matcher class also has groupCount(), which returns the number of capture groups in the pattern.

Pattern p=Pattern.compile("([a-z]+)(\\d+)");
Matcher m=p.matcher("aaa2223bb");
m.find();   //Matches aaa2223
m.groupCount();   //Returns 2 because there are 2 groups
m.start(1);   //Returns 0, the index where group 1 (aaa) starts
m.start(2);   //Returns 3, the index where group 2 (2223) starts
m.end(1);   //Returns 3, the index just after the last character of group 1
m.end(2);   //Returns 7, the index just after the last character of group 2
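The same pattern can also return the text captured by each group through group(int i). A minimal sketch, where GroupDemo and lettersAndDigits are illustrative names and an empty array stands in for "no match":

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that extracts the two capture groups.
class GroupDemo {
    static final Pattern LETTERS_THEN_DIGITS = Pattern.compile("([a-z]+)(\\d+)");

    // Returns {letters, digits} for the first match, or an empty array if none.
    static String[] lettersAndDigits(String input) {
        Matcher m = LETTERS_THEN_DIGITS.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lettersAndDigits("aaa2223bb"))); // [aaa, 2223]
        System.out.println(Arrays.toString(lettersAndDigits("bb")));        // []
    }
}
```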

4 Capture groups

A capture group is a way of treating multiple characters as a single unit. It is created by enclosing the characters in parentheses. For example, the regular expression (dog) creates a single group containing "d", "o", and "g".

Capture groups are numbered by counting their opening parentheses from left to right. For example, in the expression ((A)(B(C))), there are four such groups:

  • ((A)(B(C)))
  • (A)
  • (B(C))
  • (C)

You can find out how many capture groups an expression has by calling the groupCount method on a matcher object. The groupCount method returns an int giving the number of capture groups in the matcher's pattern.

There is also a special group, group(0), which always represents the entire expression. This group is not included in the value returned by groupCount.
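The numbering rule can be checked directly by matching the nested expression against "ABC" and listing every group from 0 up to groupCount(). This is a small sketch. GroupNumbering and groups are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that lists group(0) through group(groupCount()).
class GroupNumbering {
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.matches()) {
            // Note <= : group(0) is extra, on top of the groupCount() groups.
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Index 0 is the whole match; indexes 1 to 4 follow the opening parentheses.
        System.out.println(groups("((A)(B(C)))", "ABC")); // [ABC, ABC, A, BC, C]
    }
}
```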

// Search the string using the specified pattern
      String line = "This order was placed for QT3000! OK?";
      String pattern = "(\\D*)(\\d+)(.*)";
 
      // Create a Pattern Object
      Pattern r = Pattern.compile(pattern);
 
      // Now create the matcher object
      Matcher m = r.matcher(line);
      if (m.find( )) {
         System.out.println("Found value: " + m.group(0) );
         System.out.println("Found value: " + m.group(1) );
         System.out.println("Found value: " + m.group(2) );
         System.out.println("Found value: " + m.group(3) ); 
      } else {
         System.out.println("NO MATCH");
      }

Posted by irishpeck on Mon, 27 Sep 2021 09:32:38 -0700