Methods in the Matcher Class:

This is one of the most important classes: if you are going to work with regular expressions, you will mostly work with the methods in this class. A Matcher object interprets the Pattern and performs match operations against the input String. The Matcher class implements the MatchResult interface.

Below are some of the important methods of this class.

The start and end Methods

int start() Returns the start index of the previous match.

int end() Returns the offset after the last character matched.

Let's understand the start and end methods by example.

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExStartEndMethodExample {

	public static void main(String[] args) {
		testStartEndMethod();
	}
	
	public static void testStartEndMethod(){
		String regex="keep";
		String input="keepkeep and keep learning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		int count = 0;
		while(matcher.find()){
			count++;
			System.out.println("Match Number: "+count);
			System.out.println("Start Index: "+matcher.start());
			System.out.println("End Index: "+matcher.end());
		}
	}

}

 

Output:

Match Number: 1
Start Index: 0
End Index: 4
Match Number: 2
Start Index: 4
End Index: 8
Match Number: 3
Start Index: 13
End Index: 17

So in the above example we are searching for the word "keep", which is found three times in total.

Important: For Match Number 1 the start index is 0 and the end index is 4, which is also the length of the word "keep". By convention, ranges are inclusive of the beginning index and exclusive of the end index, so the end method returns the offset just after the last matched character, not the index of that character.

So the first match for the word "keep" starts at 0 and ends at 4, even though the characters themselves only occupy cells 0, 1, 2 and 3.

[0:"k", 1:"e", 2:"e", 3:"p"]
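One practical consequence of the inclusive/exclusive convention is that start() and end() can be passed straight to String.substring, and end() minus start() gives the length of the match. Here is a minimal sketch of that idea; the class name StartEndSubstringSketch and the helper firstMatchBySubstring are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StartEndSubstringSketch {

	// Hypothetical helper: returns the text between start() (inclusive)
	// and end() (exclusive) of the first match, or null if nothing matches.
	static String firstMatchBySubstring(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		if (!matcher.find()) {
			return null; // no match, so start() and end() must not be called
		}
		return input.substring(matcher.start(), matcher.end());
	}

	public static void main(String[] args) {
		String input = "keepkeep and keep learning";
		Matcher matcher = Pattern.compile("keep").matcher(input);
		while (matcher.find()) {
			int length = matcher.end() - matcher.start();
			String viaSubstring = input.substring(matcher.start(), matcher.end());
			// The substring cut with start()/end() is the same text that group() returns.
			System.out.println(viaSubstring + " (length " + length + ") equals group(): "
					+ viaSubstring.equals(matcher.group()));
		}
	}
}
```

Note that start() and end() throw an IllegalStateException if no match has been attempted yet or if the previous match attempt failed, which is why the helper checks the result of find() first.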

 

The find Method

boolean find() Attempts to find the next subsequence of the input sequence that matches the pattern. We use this method in a while loop to find multiple matches in the given input. Once a match is found, we can obtain more information via the start, end and group methods.

Example: 

String regex="keep";
String input="keepkeep and keep learning";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
int count = 0;
while(matcher.find()){
	count++;
	System.out.println("Match Number: "+count);
	System.out.println("Start Index: "+matcher.start());
	System.out.println("End Index: "+matcher.end());
}

 

boolean find(int start) Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.

Note: If you pass the same index to this method in the condition of a while loop, you will create an infinite loop, because every call resets the matcher and searches again from that index. Use this method with care inside loops (e.g. if you use "matcher.find(4)" in the while loop of the above example, it will never terminate).

Example:

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExFindMethodWithIndex {

	public static void main(String[] args) {
		testFindMethodWithSpecificIndex();
	}
	
	public static void testFindMethodWithSpecificIndex(){
		String regex="keep";
		String input="keepkeep and keep learning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		matcher.find(4); // find with specified index number
		System.out.println("Start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
		matcher.find(4); // find with specified index number
		System.out.println("Start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
		matcher.find(8); // find with specified index number
		System.out.println("Start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
	}
}

 

Output:

Start Index: 4
End Index: 8
Start Index: 4
End Index: 8
Start Index: 13
End Index: 17
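If you want to start the search at a given index and still loop over the remaining matches, one way is to call find(int) once and then continue with the no-argument find(). The following is only a sketch of that approach; the class name FindFromIndexSketch and the helper startsFrom are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromIndexSketch {

	// Hypothetical helper: collects the start index of every match
	// found at or after the given index.
	static List<Integer> startsFrom(String regex, String input, int from) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		List<Integer> starts = new ArrayList<>();
		// find(int) resets the matcher, so it is called only once to position the search.
		boolean found = matcher.find(from);
		while (found) {
			starts.add(matcher.start());
			// The no-argument find() continues after the previous match, so the loop ends.
			found = matcher.find();
		}
		return starts;
	}

	public static void main(String[] args) {
		// Prints [4, 13] for the input used throughout this post.
		System.out.println(startsFrom("keep", "keepkeep and keep learning", 4));
	}
}
```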

 

The matches and lookingAt Methods

boolean matches() Attempts to match the entire region against the pattern. If the match succeeds, you can obtain more information via the start, end and group methods.

boolean lookingAt() Attempts to match the input sequence, starting at the beginning of the region, against the pattern.

Both methods start at the beginning of the region. The only difference is that the matches method requires the entire region to match, whereas lookingAt does not.

Let's understand this by example.

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExLookingAtMatchesDifferent {

	public static void main(String[] args) {
		testLookingAtAndMatches();
	}
	
	public static void testLookingAtAndMatches(){
		String regex="keep";
		String input="keepkeep and keep learning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		System.out.println("matches(): "+matcher.matches());
		
		System.out.println("lookingAt(): "+matcher.lookingAt());
		System.out.println("start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
		
	}
}

 

Output:

matches(): false
lookingAt(): true
start Index: 0
End Index: 4
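To see the difference side by side, it can help to run matches, lookingAt and find against a few inputs. The sketch below does that; the class name MatchesLookingAtSketch and the helper describe are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class MatchesLookingAtSketch {

	// Hypothetical helper: reports how the three methods judge the same input.
	static String describe(String regex, String input) {
		Pattern pattern = Pattern.compile(regex);
		// A fresh matcher per call keeps the three checks independent of each other.
		boolean whole = pattern.matcher(input).matches();    // entire input must match
		boolean prefix = pattern.matcher(input).lookingAt(); // match must start at index 0
		boolean anywhere = pattern.matcher(input).find();    // match may start anywhere
		return "matches=" + whole + ", lookingAt=" + prefix + ", find=" + anywhere;
	}

	public static void main(String[] args) {
		System.out.println(describe("keep", "keep"));                 // all three succeed
		System.out.println(describe("keep", "keep learning"));        // matches() fails
		System.out.println(describe("keep", "please keep learning")); // only find() succeeds
	}
}
```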

 

The replaceFirst and replaceAll Methods

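As the names suggest, the replaceFirst method replaces the first occurrence of the pattern and the replaceAll method replaces every occurrence. Both methods reset the matcher before they start and return a new String; the input itself is not changed.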

Let's understand this by Example.

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExReplaceFirstAndAll {

	public static void main(String[] args) {
		testReplaceFirstAndAll();
	}
	public static void testReplaceFirstAndAll(){
		String regex="keep";
		String input="keep keep learning";
		String replaceText = "yes";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		System.out.println("replaceFirst(): "+matcher.replaceFirst(replaceText));
		System.out.println("replaceAll(): "+matcher.replaceAll(replaceText));
	}

}

Output:

replaceFirst(): yes keep learning
replaceAll(): yes yes learning
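The Javadoc states that replaceAll first resets the matcher, which explains why calling replaceFirst and then replaceAll on the same Matcher works in the example above. A small sketch that makes this visible, assuming a made-up class ReplaceResetSketch and helper replaceAllAfterFind:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceResetSketch {

	// Hypothetical helper: consumes one match with find() and then calls replaceAll.
	static String replaceAllAfterFind(String regex, String input, String replacement) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		matcher.find(); // consume the first match on purpose
		// replaceAll resets the matcher first, so the match consumed above is still replaced.
		return matcher.replaceAll(replacement);
	}

	public static void main(String[] args) {
		String input = "keep keep learning";
		System.out.println(replaceAllAfterFind("keep", input, "yes")); // yes yes learning
		System.out.println(input); // the original String is untouched: keep keep learning
	}
}
```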

 

The appendReplacement and appendTail Methods

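The appendReplacement and appendTail methods are also used for text replacement. They work together in a find loop: appendReplacement copies the text before each match, followed by the replacement, into a buffer, and appendTail copies whatever remains after the last match. Let's understand this by example.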

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExappendReplacementAndappendTail {

	public static void main(String[] args) {
		testAppendReplaceAndTail();
	}
	
	public static void testAppendReplaceAndTail(){
		String regex="not";
		String input="notkeepnotkeepnotlearning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		StringBuffer buffer = new StringBuffer();
		while(matcher.find()){
			matcher.appendReplacement(buffer, " "); // replace "not" with a space
		}
		matcher.appendTail(buffer);
		System.out.println("Replaced Text: "+buffer.toString());
	}

}

 

Output: 

Replaced Text:  keep keep learning
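Where these two methods become more useful than replaceAll is when the replacement has to be computed for each match. The following sketch upper-cases every match and appends a running number; the class name AppendReplacementSketch and the helper numberMatches are assumptions made for this example. Matcher.quoteReplacement is used so that any "$" or "\" in the computed text is treated literally.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementSketch {

	// Hypothetical helper: replaces each match with its upper-case form plus a counter.
	static String numberMatches(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		StringBuffer buffer = new StringBuffer();
		int count = 0;
		while (matcher.find()) {
			count++;
			String replacement = matcher.group().toUpperCase(Locale.ROOT) + count;
			// Copies the text before this match, then the (literal) replacement.
			matcher.appendReplacement(buffer, Matcher.quoteReplacement(replacement));
		}
		// Copies the text after the last match.
		matcher.appendTail(buffer);
		return buffer.toString();
	}

	public static void main(String[] args) {
		System.out.println(numberMatches("keep", "keep keep learning")); // KEEP1 KEEP2 learning
	}
}
```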

 

The methods above are the most frequently used ones. The class has other methods as well; let's go through them quickly. To understand them properly, I recommend writing some code that exercises them.

For example:

Matcher usePattern(Pattern newPattern): Changes the pattern that this Matcher uses to find matches and returns the same Matcher (not a new one). The matcher keeps its position in the input but loses the group information from the last match.

Matcher reset(): Resets this Matcher, discarding its match state and setting its region back to the entire input.

Matcher reset(CharSequence input): Resets this Matcher with a new input sequence.
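Here is a rough sketch that chains usePattern, reset() and reset(CharSequence) on one Matcher and records which text each find() call sees. The class name UsePatternResetSketch, the helper trace and the second pattern "[a-z]+" are choices made only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UsePatternResetSketch {

	// Hypothetical helper: runs the three calls in sequence and records what each find() sees.
	static String trace() {
		Matcher matcher = Pattern.compile("keep").matcher("keep learning");
		matcher.find(); // matches "keep" at 0..4

		// usePattern returns the same Matcher object and keeps the current position.
		Matcher returned = matcher.usePattern(Pattern.compile("[a-z]+"));
		boolean sameObject = (returned == matcher);
		matcher.find(); // continues after the previous match
		String afterSwitch = matcher.group();

		// reset() goes back to the start of the same input.
		matcher.reset();
		matcher.find();
		String afterReset = matcher.group();

		// reset(CharSequence) swaps in a new input and starts from its beginning.
		matcher.reset("never stop");
		matcher.find();
		String afterNewInput = matcher.group();

		return sameObject + "," + afterSwitch + "," + afterReset + "," + afterNewInput;
	}

	public static void main(String[] args) {
		System.out.println(trace()); // true,learning,keep,never
	}
}
```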

There are also methods for setting the limits of the matcher's region and for getting the start and end index of that region.

Matcher region(int start, int end): Sets the limits of this matcher's region. The region is the part of the input sequence that will be searched to find a match. Invoking this method resets the matcher, and then sets the region to start at the index specified by the start parameter and end at the index specified by the end parameter.

int regionStart(): Reports the start index of this matcher's region. The searches this matcher conducts are limited to finding matches within regionStart (inclusive) and regionEnd (exclusive).

int regionEnd(): Reports the end index (exclusive) of this matcher's region. The searches this matcher conducts are limited to finding matches within regionStart (inclusive) and regionEnd (exclusive).
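A short sketch of how a region narrows the search is shown below. The class name RegionSketch and the helper startsInRegion are hypothetical. Note that start() still reports indexes relative to the whole input, not relative to the region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionSketch {

	// Hypothetical helper: collects the start index of every match inside [start, end).
	static List<Integer> startsInRegion(String regex, String input, int start, int end) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		matcher.region(start, end); // resets the matcher and limits the search
		List<Integer> starts = new ArrayList<>();
		while (matcher.find()) {
			starts.add(matcher.start()); // index in the full input
		}
		return starts;
	}

	public static void main(String[] args) {
		String input = "keepkeep and keep learning";
		System.out.println(startsInRegion("keep", input, 4, 17)); // [4, 13]
		// The last "keep" ends at offset 17, so it does not fit in a region ending at 16.
		System.out.println(startsInRegion("keep", input, 4, 16)); // [4]
	}
}
```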

 

That's all about Matcher Class.

There is another class in the regular expression package (java.util.regex) called PatternSyntaxException. It is an unchecked exception that is thrown to indicate a syntax error in a regular expression pattern. This class has the following methods.

public String getDescription() Retrieves the description of the error.

public int getIndex() Retrieves the error index.

public String getPattern() Retrieves the erroneous regular expression pattern.

public String getMessage() Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern.
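As a quick illustration, the sketch below compiles a deliberately broken pattern and reads the details from the exception. The class name PatternSyntaxSketch, the helper describeError and the sample pattern "(keep" (a group that is never closed) are assumptions for this example; the exact wording of the description may differ between Java versions.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PatternSyntaxSketch {

	// Hypothetical helper: returns null for a valid pattern,
	// otherwise a one-line summary built from the exception's getters.
	static String describeError(String regex) {
		try {
			Pattern.compile(regex);
			return null;
		} catch (PatternSyntaxException e) {
			return "pattern=" + e.getPattern()
					+ ", index=" + e.getIndex()
					+ ", description=" + e.getDescription();
		}
	}

	public static void main(String[] args) {
		System.out.println(describeError("keep"));  // null, the pattern is valid
		System.out.println(describeError("(keep")); // summary of the syntax error
	}
}
```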

 

Reference:

Matcher tutorial: https://docs.oracle.com/javase/tutorial/essential/regex/matcher.html

Matcher Javadoc: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html