
Regular Expression in Java

Regular expressions are a way to search, edit, and manipulate text and data. A regular expression is built from literal characters plus a set of special characters and constructs.

By combining regular-expression constructs you can build complex expressions. I will not explain here what a regular expression is or list the available constructs, because the Oracle documentation for Pattern and the Oracle tutorial "Introduction to Regular Expressions" already explain them very well.

The regular expression classes are in the java.util.regex package, which consists of three primary classes.

The descriptions of the classes below are taken from the Oracle tutorial.

 

  • Pattern - A Pattern object is a compiled representation of a regular expression. The Pattern class provides no public constructors. To create a pattern, you must first invoke one of its public static compile methods, which will then return a Pattern object. These methods accept a regular expression as the first argument. 
  • Matcher - A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher defines no public constructors. You obtain a Matcher object by invoking the matcher method on a Pattern object. 
  • PatternSyntaxException - A PatternSyntaxException object is an unchecked exception that indicates a syntax error in a regular expression pattern.
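
As a rough illustration of how Pattern and Matcher fit together, here is a minimal sketch. The class name RegexClassesSketch and the helper findAll are made up for this example; only Pattern.compile, Pattern.matcher, Matcher.find and Matcher.group come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexClassesSketch {

    // Hypothetical helper: collects every substring of input that matches regex.
    static List<String> findAll(String regex, String input) {
        // Pattern has no public constructor, so it is created with compile().
        Pattern pattern = Pattern.compile(regex);
        // Matcher has no public constructor either; it comes from the Pattern.
        Matcher matcher = pattern.matcher(input);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("[0-9]+", "a1b22c333")); // [1, 22, 333]
    }
}
```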

  Let's understand regular expressions through examples and use cases.

Use Case 1:

Write a Java method that validates input text in which users may enter only a certain set of special characters. If they enter anything outside the defined set, you have to show an error message.
Let's say the input field should accept only the following characters: "#$%/\\^*,.-_+=:;?@!"

RegExPatternUtility.java

package com.dk.ex.reg.ex.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegExPatternUtility {
    /**
     * Checks that the data contains only characters from the allowed set.
     * 
     * Returns true only when data contains at least one character matched
     * by pattern and no character from the hard-coded exclude list
     * (letters, digits and <>`~'(){}[]).
     * 
     * You can modify this method as needed. For example, you can pass the
     * set of characters to exclude as another parameter, e.g.
     * containsOnlySpecificCharacter(String pattern,
     * String excludeStrPattern, String data).
     * 
     * @param pattern regular expression describing the allowed characters
     * @param data input text to validate
     * @return true if data passes the check, false otherwise
     */
    public static boolean containsOnlySpecificCharacter(String pattern,
            String data) {
        Pattern specialCharPattern = Pattern.compile(pattern);
        Matcher spclCharMatcher = specialCharPattern.matcher(data);
        Pattern otherCharPattern = Pattern
                .compile("[a-zA-Z0-9<>`~'(){}\\[\\]]");
        Matcher otherCharMatcher = otherCharPattern.matcher(data);
        return (spclCharMatcher.find() && !otherCharMatcher.find());
    }
}

 

RegExExample.java

package com.dk.ex.reg.ex.test;
public class RegExExample {
    public static void main(String[] args) {
        testcontainsOnlySpecificChar();
    }
    
    public static void testcontainsOnlySpecificChar(){
        // Prints false because the data contains other characters (letters).
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.-_+=:;?@!]",
                        "Dee!@pak"));
        // Prints true because the data contains only expected characters.
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.-_+=:;?@!]",
                        "!*"));
        // Prints true for the same reason.
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.-_+=:;?@!]",
                        "+_-;!*"));
        // Prints false because the data contains digits.
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.-_+=:;?@!]",
                        "1!*4^"));
    }
}

 

In the above example, the RegExPatternUtility class contains the method containsOnlySpecificCharacter(), whose first parameter is the pattern of characters we want to allow. Inside the same method we reject other text with the pattern [a-zA-Z0-9<>`~'(){}\\[\\]]. In this way you allow only the set of characters you want to include. Note that inside the character class the sequence .-_ is read as a range from '.' to '_', which also covers digits and uppercase letters; the example still prints the expected results only because the exclude pattern rejects those characters.

You can customize this method as needed (i.e. you can add another parameter for the exclude pattern), as mentioned in the comment on containsOnlySpecificCharacter(), so that you can include and exclude any set of characters.
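
The method above only rejects characters from its exclude list, so a character that is in neither list (a space, for example) would still pass. An alternative is an allow-list that must match the whole input. The following is only a sketch under assumptions: the class AllowListSketch and the method containsOnlyAllowed are hypothetical names, and the allowed set is assumed to be # $ % / \ ^ * , . - _ + = : ; ? @ ! with a literal backslash and a literal hyphen.

```java
import java.util.regex.Pattern;

class AllowListSketch {

    // Assumed allow-list: # $ % / \ ^ * , . - _ + = : ; ? @ !
    // "\\\\" is a literal backslash, and the hyphen is escaped ("\\-")
    // so that ".-_" is not read as a range.
    private static final Pattern ALLOWED =
            Pattern.compile("[#$%/\\\\^*,.\\-_+=:;?@!]+");

    // True only when every character of data is in the allow-list
    // (and data is not empty), because matches() checks the whole input.
    static boolean containsOnlyAllowed(String data) {
        return ALLOWED.matcher(data).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsOnlyAllowed("+_-;!*"));   // true
        System.out.println(containsOnlyAllowed("Dee!@pak")); // false
        System.out.println(containsOnlyAllowed("! *"));      // false, space is not allowed
    }
}
```

Compiling the pattern once into a constant also avoids recompiling it on every call.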

 

Use Case 2:

Write Java code to check whether a string starts or ends with numbers.

RegExPatternUtility.java 

package com.dk.ex.reg.ex.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegExPatternUtility {
            
    public static boolean isMatchExist(String pattern, String data){
        Pattern patternData = Pattern.compile(pattern);
        Matcher matcher = patternData.matcher(data);
        // find() succeeds if any part of data matches the pattern.
        // Note: Pattern.matches(pattern, data) is not equivalent, because
        // it requires the whole input to match.
        return matcher.find();
    }
    
}

 

RegExExample.java 

package com.dk.ex.reg.ex.test;
public class RegExExample {
    public static void main(String[] args) {
        testStringStartOrEndWithNumber();
        testStringStartOrEndWithNumbersInSinglePattern();
        testStringStartAndEndWithNumbers();
    }
    public static void testStringStartOrEndWithNumber() {
        /**
         * To test whether a string starts with a digit, use the pattern
         * ^[0-9]. The character ^ indicates the beginning of a line.
         */
        // true
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("^[0-9]", "123string"));
        // false
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("^[0-9]", "string123"));
        /**
         * To check whether a string ends with a digit, use the character $,
         * which indicates the end of a line. The regex then looks like
         * [0-9]$
         */
        // false
        System.out.println("3 : "
                + RegExPatternUtility.isMatchExist("[0-9]$", "123string"));
        // true
        System.out.println("4 : "
                + RegExPatternUtility.isMatchExist("[0-9]$", "string123"));
        /**
         * To check whether a string has N digits at the start or at the
         * end, use a quantifier. You can use \d instead of [0-9].
         * Available quantifiers:
         * X{n}   X, exactly n times
         * X{n,}  X, at least n times
         * X{n,m} X, at least n but not more than m times
         */
        // true
        System.out.println("5 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}", "123string"));
        // false
        System.out.println("6 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}", "12string"));
        // false, the numbers are at the end.
        System.out.println("7 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}", "string123"));
        // false
        System.out.println("8 : "
                + RegExPatternUtility.isMatchExist("\\d{3}$", "123string"));
        // true
        System.out.println("9 : "
                + RegExPatternUtility.isMatchExist("\\d{3}$", "string123"));
        // false.
        System.out.println("10 : "
                + RegExPatternUtility.isMatchExist("\\d{3}$", "string23"));
    }
    public static void testStringStartOrEndWithNumbersInSinglePattern() {
        // true
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}|[0-9]{3}$",
                        "123string123"));
        // true
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "123string"));
        // true
        System.out.println("3 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "string123"));
        // false.
        System.out.println("4 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "str123ing"));
        // false.
        System.out.println("5 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "12string12"));
    }
    public static void testStringStartAndEndWithNumbers() {
        // true
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123string123"));
        // false
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "12string123"));
        // false
        System.out.println("3 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123string12"));
        // false
        System.out.println("4 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123string"));
        // false
        System.out.println("5 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123123"));
        // false
        System.out.println("6 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "string123"));
        // true
        System.out.println("7 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "1234string1234"));
    }
    /**
     * In a similar manner you can search for data starting or ending with
     * text.
     */
}

 

Intersection, Union or Subtraction

You can perform intersection, union, or subtraction on character classes in a regular expression, as shown below.

 

  • [0-4[6-8]] matches the digits 0, 1, 2, 3, 4, 6, 7 and 8. The digit 5 does not match (union)
  • [a-z&&[def]] d, e, or f (intersection)
  • [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
  • [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)

 

 Intersection example 

    /**
     * Intersection of character sets. For example, within the character
     * set a-z you want to return true only when the String contains
     * d, e or f.
     * 
     * You can use the regex [a-z&&[def]]
     */
    
    public static void intersection(){
        // true 
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("[a-z&&[def]]",
                        "abcd"));
        //false
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("[a-z&&[def]]",
                        "abc"));
    }
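
The union and subtraction forms from the list above can be checked in the same way. This is a small sketch, not part of the original utility class: CharClassSetsSketch and wholeMatch are hypothetical names, and Pattern.matches is used so that the single input character must match the class as a whole.

```java
import java.util.regex.Pattern;

class CharClassSetsSketch {

    // Hypothetical helper: true when the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Union: 0-4 or 6-8
        System.out.println(wholeMatch("[0-4[6-8]]", "7"));    // true
        System.out.println(wholeMatch("[0-4[6-8]]", "5"));    // false
        // Subtraction: a-z except b and c
        System.out.println(wholeMatch("[a-z&&[^bc]]", "a"));  // true
        System.out.println(wholeMatch("[a-z&&[^bc]]", "b"));  // false
        // Subtraction: a-z except m through p
        System.out.println(wholeMatch("[a-z&&[^m-p]]", "n")); // false
        System.out.println(wholeMatch("[a-z&&[^m-p]]", "q")); // true
    }
}
```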

 

I hope this article was useful. In the next article I will cover the different regex methods in Java.


Common Methods for Regular Expression in Java

In this article I will cover the common regular expression methods in Java and explain them with examples.

The regular expression classes are in the java.util.regex package, which consists of three primary classes.

Pattern: The Pattern class is a compiled representation of a regular expression. This class is the starting point for working with regular expressions.

Methods in Pattern class

Pattern compile(String regex): This static method is the starting point for working with regular expressions; it compiles the given regular expression into a Pattern. 

 Pattern p = Pattern.compile("a*b");

 

Pattern compile(String regex, int flags): The second version of the method accepts another parameter called flags, a bit mask of match flags that controls how the compiled pattern matches the data.

Different types of flags 

/**
For example, to enable case-insensitive
matching you can use the
CASE_INSENSITIVE flag.
*/ 
Pattern p = Pattern.compile("a*b", Pattern.CASE_INSENSITIVE);

 

Matcher matcher(CharSequence input): Creates a Matcher object that will match the given input against this compiled pattern. We will discuss Matcher in detail later. 

// regex and data are String variables holding the expression and the input
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(data);

 

boolean matches(String regex, CharSequence input): This static method compiles the given regular expression and attempts to match the entire input against it. 

boolean match = Pattern.matches("[a-z]+", "abcd"); // true
/*
The same can be achieved in this way as well
*/
Pattern pattern = Pattern.compile("[a-z]+");
Matcher matcher = pattern.matcher("abcd");
boolean sameMatch = matcher.matches(); // true
// Unlike matches(), find() succeeds when any part of the input matches
boolean found = pattern.matcher("abcd1").find(); // true

 

String[] split(CharSequence input): This method splits the given input around matches of the regular expression pattern. It is useful for tokenizing a String based on a certain delimiter.

Note: Trailing empty strings are not included in the resulting array. 

Pattern pattern = Pattern.compile(":");
String [] states = pattern.split("CA:OH:GA");
// output
// states = {"CA", "OH", "GA"}
/*
Explanation of the note:
For the input "foo:bar:foo" and the regex "o", the output
is {"f", "", ":bar:f"}.
The "foo" at the start is split into "f" and "" (an empty string),
but for the "foo" at the end of the input the trailing
empty strings are not included.
*/

 

String[] split(CharSequence input, int limit): This method also splits the given input around matches of the regular expression pattern, but the second parameter, limit, controls the number of times the pattern is applied and therefore affects the length of the resulting array:

  • If the limit n is greater than zero, the pattern is applied at most n - 1 times, the array's length is no greater than n, and the array's last entry contains all input beyond the last matched delimiter.
  • If n is negative, the pattern is applied as many times as possible and the array can have any length.
  • If n is zero, the pattern is applied as many times as possible, the array can have any length, and trailing empty strings are discarded.

Example

import java.util.regex.Pattern;

public class TestRegExSplit {

	public static void main(String[] args) {
		Pattern pattern = Pattern.compile(":");
		//RegEx : and limit : 2
		String [] results = pattern.split("foo:bar:foo", 2);
		printResult(results); //{foo,bar:foo}
		
		results = pattern.split("foo:bar:foo", 5);
		printResult(results); //{foo,bar,foo}
		
		results = pattern.split("foo:bar:foo", -2);
		printResult(results); //{foo,bar,foo}
		
		pattern = Pattern.compile("o");
		results = pattern.split("foo:bar:foo", 5);
		printResult(results); //{f,,:bar:f,,}
		
		results = pattern.split("foo:bar:foo", -2);
		printResult(results); //{f,,:bar:f,,}
		
		results = pattern.split("foo:bar:foo", 0);
		printResult(results); //{f,,:bar:f}
	}
	
	private static void printResult(String [] results){
		StringBuilder builder = new StringBuilder();
		builder.append("{");
		for (String result : results) {
			builder.append(result);
			builder.append(",");
		}
		builder.deleteCharAt(builder.length()-1);
		builder.append("}");
		
		System.out.println(builder.toString());
	}
}

 

Flag constants (each is a static int field of Pattern): 

  • CANON_EQ - Enables canonical equivalence.
  • CASE_INSENSITIVE - Enables case-insensitive matching.
  • COMMENTS - Permits whitespace and comments in pattern.
  • DOTALL - Enables dotall mode.
  • LITERAL - Enables literal parsing of the pattern.
  • MULTILINE - Enables multiline mode.
  • UNICODE_CASE - Enables Unicode-aware case folding.
  • UNICODE_CHARACTER_CLASS - Enables the Unicode version of Predefined character classes and POSIX character classes.
  • UNIX_LINES - Enables Unix lines mode.
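
To see what a few of these flags change in practice, here is a minimal sketch. The class FlagsSketch and the helper found are made-up names for illustration; the flag constants and Pattern.compile(String, int) are the real API. Flags are combined with the bitwise OR operator.

```java
import java.util.regex.Pattern;

class FlagsSketch {

    // Hypothetical helper: compiles regex with the given flags and
    // reports whether any part of input matches.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // MULTILINE: ^ and $ also match at line boundaries inside the input.
        System.out.println(found("^bar$", 0, "foo\nbar"));                 // false
        System.out.println(found("^bar$", Pattern.MULTILINE, "foo\nbar")); // true

        // DOTALL: the dot also matches line terminators.
        System.out.println(found("a.b", 0, "a\nb"));              // false
        System.out.println(found("a.b", Pattern.DOTALL, "a\nb")); // true

        // LITERAL: metacharacters lose their special meaning.
        System.out.println(found("a.b", Pattern.LITERAL, "axb")); // false
        System.out.println(found("a.b", Pattern.LITERAL, "a.b")); // true

        // Two flags combined with |.
        int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        System.out.println(found("KEEP$", flags, "keep\nlearning")); // true
    }
}
```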

 

Methods in Matcher Class: 

This is one of the most important classes; if you work with regular expressions, you will mostly work with its methods. A Matcher object interprets the Pattern and performs the search against the input String. The Matcher class implements the MatchResult interface.

Below are some important methods of this class.

The start and end Method

int start() Returns the start index of the previous match.

int end() Returns the offset after the last character matched.

Let's understand the start and end methods by example. 

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExStartEndMethodExample {

	public static void main(String[] args) {
		testStartEndMethod();
	}
	
	public static void testStartEndMethod(){
		String regex="keep";
		String input="keepkeep and keep learning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		int count = 0;
		while(matcher.find()){
			count++;
			System.out.println("Match Number: "+count);
			System.out.println("Start Index: "+matcher.start());
			System.out.println("End Index: "+matcher.end());
		}
	}

}

 

Output:

Match Number: 1
Start Index: 0
End Index: 4
Match Number: 2
Start Index: 4
End Index: 8
Match Number: 3
Start Index: 13
End Index: 17

In the above example we search for the word "keep", which is found three times.

Important**: For Match Number 1 the start index is 0 and the end index is 4, which is also the length of the word "keep". By convention, ranges are inclusive of the beginning index and exclusive of the end index: the end method returns the offset after the last matched character, not the index of that character. 

So the first match for the word "keep" starts at 0 and ends at 4, even though the characters themselves only occupy cells 0, 1, 2 and 3.

[0:"k", 1:"e", 2:"e", 3:"p"]

 

The find Method

boolean find() Attempts to find the next subsequence of the input sequence that matches the pattern. We call this method in a while loop to find multiple matches in the given input. Once a match is found, we can obtain more information via the start, end and group methods.

Example: 

String regex="keep";
String input="keepkeep and keep learning";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
int count = 0;
while(matcher.find()){
	count++;
	System.out.println("Match Number: "+count);
	System.out.println("Start Index: "+matcher.start());
	System.out.println("End Index: "+matcher.end());
}

 

boolean find(int start) Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.

Note**: If you pass the same index on every iteration of a while loop, the loop never ends, because each call resets the matcher to that index. Use this method with care in loops (e.g. replacing find() with "matcher.find(4)" in the while loop above creates an infinite loop). 

Example:

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExFindMethodWithIndex {

	public static void main(String[] args) {
		testFindMethodWithSpecificIndex();
	}
	
	public static void testFindMethodWithSpecificIndex(){
		String regex="keep";
		String input="keepkeep and keep learning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		matcher.find(4); // find with specified index number
		System.out.println("Start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
		matcher.find(4); // find with specified index number
		System.out.println("Start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
		matcher.find(8); // find with specified index number
		System.out.println("Start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
	}
}

 

Output:

Start Index: 4
End Index: 8
Start Index: 4
End Index: 8
Start Index: 13
End Index: 17
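
If find(int) is needed inside a loop, one way to avoid the infinite loop described in the note is to move the start index forward after every match. The sketch below is one possible approach, not the only one; the class FindFromIndexSketch and the method startsFrom are hypothetical names, and the from argument is assumed to be between 0 and the input length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromIndexSketch {

    // Hypothetical helper: start indexes of all matches at or after 'from'.
    static List<Integer> startsFrom(String regex, String input, int from) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<Integer> starts = new ArrayList<>();
        int next = from;
        // find(int) resets the matcher on every call, so the index passed
        // in has to advance; otherwise the same match is found forever.
        while (next <= input.length() && matcher.find(next)) {
            starts.add(matcher.start());
            // Continue after this match; step by one for an empty match.
            next = Math.max(matcher.end(), matcher.start() + 1);
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "keepkeep and keep learning";
        System.out.println(startsFrom("keep", input, 4)); // [4, 13]
        System.out.println(startsFrom("keep", input, 0)); // [0, 4, 13]
    }
}
```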

 

The matches and lookingAt Method

boolean matches() Attempts to match the entire region against the pattern. Once the match succeeds you can obtain more information via the start, end and other methods.

boolean lookingAt() Attempts to match the input sequence, starting at the beginning of the region, against the pattern.

The only difference between lookingAt and matches is that matches requires the entire region to match, whereas lookingAt does not.

Let's understand this by example.

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExLookingAtMatchesDifferent {

	public static void main(String[] args) {
		testLookingAtAndMatches();
	}
	
	public static void testLookingAtAndMatches(){
		String regex="keep";
		String input="keepkeep and keep learning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		System.out.println("matches(): "+matcher.matches());
		
		System.out.println("lookingAt(): "+matcher.lookingAt());
		System.out.println("start Index: "+matcher.start());
		System.out.println("End Index: "+matcher.end());
		
		
	}
}

 

Output:

matches(): false
lookingAt(): true
start Index: 0
End Index: 4

 

The replaceFirst and replaceAll Methods

As the names suggest, the replaceFirst method replaces the first occurrence and the replaceAll method replaces all occurrences.

Let's understand this by Example.

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExReplaceFirstAndAll {

	public static void main(String[] args) {
		testReplaceFirstAndAll();
	}
	public static void testReplaceFirstAndAll(){
		String regex="keep";
		String input="keep keep learning";
		String replaceText = "yes";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		System.out.println("replaceFirst(): "+matcher.replaceFirst(replaceText));
		System.out.println("replaceAll(): "+matcher.replaceAll(replaceText));
	}

}

Output:

replaceFirst(): yes keep learning
replaceAll(): yes yes learning
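
One detail worth knowing about both methods is that the replacement string is not taken fully literally: $1, $2 and so on refer to capturing groups of the match. The sketch below illustrates this; ReplaceWithGroupsSketch, swapPairs and replaceLiteral are hypothetical names, while Matcher.replaceAll and Matcher.quoteReplacement are the real API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceWithGroupsSketch {

    // Swaps "key=value" into "value=key" using group references.
    static String swapPairs(String input) {
        Matcher matcher = Pattern.compile("(\\w+)=(\\w+)").matcher(input);
        // $1 is the text captured by the first group, $2 by the second.
        return matcher.replaceAll("$2=$1");
    }

    // Replaces every match with the replacement taken literally,
    // even if it contains $ or backslash characters.
    static String replaceLiteral(String regex, String input, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(swapPairs("a=1 b=2"));                              // 1=a 2=b
        System.out.println(replaceLiteral("keep", "keep keep learning", "$1")); // $1 $1 learning
    }
}
```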

 

The appendReplacement and appendTail Methods

The appendReplacement and appendTail methods are also used for text replacement. Let's understand this by example.

package com.dk.ex.reg.ex.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExappendReplacementAndappendTail {

	public static void main(String[] args) {
		testAppendReplaceAndTail();
	}
	
	public static void testAppendReplaceAndTail(){
		String regex="not";
		String input="notkeepnotkeepnotlearning";
		Pattern pattern = Pattern.compile(regex);
		Matcher matcher = pattern.matcher(input);
		StringBuffer buffer = new StringBuffer();
		while(matcher.find()){
			matcher.appendReplacement(buffer, " "); // replace "not" with a space
		}
		matcher.appendTail(buffer);
		System.out.println("Replaced Text: "+buffer.toString());
	}

}

 

Output: 

Replaced Text:  keep keep learning
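
These two methods become more useful than replaceAll when the replacement has to be computed for each match. The following sketch upper-cases every match; AppendReplacementSketch and upperCaseMatches are hypothetical names chosen for the example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementSketch {

    // Hypothetical helper: replaces every match of regex with its
    // upper-case form and leaves the rest of the input unchanged.
    static String upperCaseMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            String replacement = matcher.group().toUpperCase(Locale.ROOT);
            // Copies the text before the match, then the replacement.
            // quoteReplacement keeps $ and \ in the replacement literal.
            matcher.appendReplacement(buffer, Matcher.quoteReplacement(replacement));
        }
        // Copies whatever follows the last match.
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseMatches("keep", "keep calm and keep learning"));
        // KEEP calm and KEEP learning
    }
}
```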

 

The methods above are the most frequently used ones. There are other methods in this class; let's go through them quickly. To understand these methods, I recommend writing some code with them.

E.g.

Matcher usePattern(Pattern newPattern): Changes the pattern that this Matcher uses to find matches and returns this Matcher.

Matcher reset(): Resets the current Matcher.

Matcher reset(CharSequence): Resets the Matcher with a new input sequence.
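
A small sketch of usePattern and reset follows. The class ResetAndUsePatternSketch and the method demo are hypothetical; the example relies on the documented behaviour that usePattern keeps the matcher's position in the input, while reset(CharSequence) starts over on new input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetAndUsePatternSketch {

    static List<String> demo() {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile("[0-9]+").matcher("12 apples");

        if (matcher.find()) {
            found.add(matcher.group());      // "12"
        }
        // Switch to a letters pattern; the position after "12" is kept.
        matcher.usePattern(Pattern.compile("[a-z]+"));
        if (matcher.find()) {
            found.add(matcher.group());      // "apples"
        }
        // New input, same (letters) pattern, searching from the start.
        matcher.reset("7 pears");
        if (matcher.find()) {
            found.add(matcher.group());      // "pears"
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [12, apples, pears]
    }
}
```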

There are also methods for setting the limits of the matcher's region and for getting the start or end index of that region.

Matcher region(int start, int end): Sets the limits of this matcher's region. The region is the part of the input sequence that will be searched to find a match. Invoking this method resets the matcher, and then sets the region to start at the index specified by the start parameter and end at the index specified by the end parameter.

int regionStart(): Reports the start index of this matcher's region. The searches this matcher conducts are limited to finding matches within regionStart (inclusive) and regionEnd (exclusive).

int regionEnd(): Reports the end index (exclusive) of this matcher's region. The searches this matcher conducts are limited to finding matches within regionStart (inclusive) and regionEnd (exclusive).
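
As an illustration of how a region limits the search, here is a minimal sketch. RegionSketch and startsInRegion are made-up names; the indexes returned by start() remain relative to the whole input, not to the region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionSketch {

    // Hypothetical helper: start indexes of matches inside [start, end).
    static List<Integer> startsInRegion(String regex, String input, int start, int end) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        matcher.region(start, end); // resets the matcher and sets the region
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "keepkeep and keep learning";
        System.out.println(startsInRegion("keep", input, 4, 17)); // [4, 13]
        // The last "keep" ends at 17, so it does not fit in [0, 16).
        System.out.println(startsInRegion("keep", input, 0, 16)); // [0, 4]
    }
}
```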

 

That's all about the Matcher class.

There is another class in the regular expression package, PatternSyntaxException, which is thrown to indicate a syntax error in a regular expression pattern. This class has the following methods.

public String getDescription() Retrieves the description of the error.

public int getIndex() Retrieves the error index.

public String getPattern() Retrieves the erroneous regular expression pattern.

public String getMessage() Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern.
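
The sketch below shows one way these accessors might be used when a pattern comes from user input. PatternSyntaxSketch and describe are hypothetical names; the exact description text is produced by the JDK, so it is not reproduced here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PatternSyntaxSketch {

    // Hypothetical helper: returns "valid" or a short error summary.
    static String describe(String regex) {
        try {
            Pattern.compile(regex);
            return "valid";
        } catch (PatternSyntaxException e) {
            // PatternSyntaxException is unchecked, so catching it is optional.
            return e.getDescription() + " at index " + e.getIndex()
                    + " in " + e.getPattern();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("[0-9]+")); // valid
        System.out.println(describe("[a-z"));   // error summary for the unclosed class
    }
}
```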

 

Reference:

Matcher tutorial: https://docs.oracle.com/javase/tutorial/essential/regex/matcher.html

Matcher Javadoc: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html

 

 


 Regular Expression Q&A 

This article covers regular expressions in a Q&A format.

 

Q: How to find/ignore Metacharacters in Regular Expression?

Metacharacters are characters with a special meaning that is interpreted by the matcher. A metacharacter changes the meaning of a pattern.

The metacharacters supported by the Java API are: <([{\^-=$!|]})?*+.>

For example:


^ The beginning of a line
$ The end of a line
X? X, once or not at all
X* X, zero or more times
X+ X, one or more times
X{n} X, exactly n times
X{n,} X, at least n times
X{n,m} X, at least n but not more than m times

So, coming to the answer to this question, you can make the matcher ignore the special meaning of these characters and treat them as literals in the following ways:

  • Escape the metacharacters with a backslash (\): this requires going through the pattern and putting a backslash in front of each metacharacter.
  • Quote the expression using Pattern.quote(regexPattern), or wrap the string as "\\Q"+regEx+"\\E".

Code Example: 

package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExMetaCharacter {

	public static void main(String[] args) {
		String input = "Dogs are wonderful$";
		String regEx1 = "Dog.";
		String regEx2 = "wonderful$";
		
		Pattern p1 = Pattern.compile(regEx1);
		Matcher m1 = p1.matcher(input);
		// A match is found because the dot (.) matches any character.
		System.out.println("***Before Escaping/Quoting the Metacharacters***\n");
		System.out.println(m1.find()); //true
		System.out.println("Start: "+m1.start());
		System.out.println("End: "+m1.end());
		
		Pattern p2 = Pattern.compile(regEx2);
		Matcher m2 = p2.matcher(input);
		// No match is found for wonderful$, because $ has a special
		// meaning: it stands for the end of a line.
		System.out.println(m2.find());
		
		/*
		 * To make these patterns match literally, you have to escape the characters
		 * that have a special meaning. The easy way used here is Pattern.quote.
		 */
		System.out.println("\n***After Escaping/Quoting the Metacharacters***\n");
		regEx1 = Pattern.quote(regEx1);
		p1 = Pattern.compile(regEx1);
		m1 = p1.matcher(input);
		// No match is found now, because the dot (.) is treated as a literal dot.
		System.out.println(m1.find()); // false
		/*
		 * Commented out because start() and end() throw IllegalStateException
		 * when no match was found.
		 */
		//System.out.println("Start: "+m1.start());
		//System.out.println("End: "+m1.end());
		regEx2 = Pattern.quote(regEx2);
		p2 = Pattern.compile(regEx2);
		m2 = p2.matcher(input);
		System.out.println(m2.find()); //true
		System.out.println("Start: "+m2.start());
		System.out.println("End: "+m2.end());
	}

}

 

Output: 

***Before Escaping/Quoting the Metacharacters***

true
Start: 0
End: 4
false

***After Escaping/Quoting the Metacharacters***

false
true
Start: 9
End: 19
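
The example above uses Pattern.quote. The other options from the list, a backslash escape and the \Q...\E form, can be sketched in the same way. EscapeSketch and found are hypothetical names for this illustration.

```java
import java.util.regex.Pattern;

class EscapeSketch {

    // Hypothetical helper: true when any part of input matches regex.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "Dogs are wonderful$";

        // Unescaped: $ means end of line, so the literal "$" is not matched.
        System.out.println(found("wonderful$", input));       // false
        // Backslash escape: "\\$" in Java source is the regex \$.
        System.out.println(found("wonderful\\$", input));     // true
        // \Q...\E quotes everything in between.
        System.out.println(found("\\Qwonderful$\\E", input)); // true
        // For input without "\E", Pattern.quote builds the same \Q...\E form.
        System.out.println(Pattern.quote("wonderful$").equals("\\Qwonderful$\\E")); // true
    }
}
```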

 

Q: What is the easiest way to find any character from the following set: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ ?

You can write a character class that lists all the characters in the above set, but there is an easier way.
Create your pattern as Pattern.compile("\\p{Punct}"); \p{Punct} is the predefined POSIX character class for punctuation characters.

Code Example:

package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExPunctuationPattern {

	public static void main(String[] args) {
		String input="#Deepak$%&";
		Pattern p = Pattern.compile("\\p{Punct}");
		Matcher match = p.matcher(input);
		System.out.println(match.find());
		System.out.println(match.replaceAll(""));
	}

}

Output:

true
Deepak

 

Q: How to match a pattern that is immediately followed by the same text?

You can achieve this with backreferences. So what is a backreference?

When an input string is matched against a regular expression, the section of the input that matches each capturing group is saved in memory for later recall via a backreference. A backreference is written in the regular expression as a backslash (\) followed by a digit indicating the number of the group to be recalled.

So, to answer the above question, let's say you want to match two digits followed by exactly the same two digits. You would use (\d\d)\1 as the regular expression:

Code Example:

 

package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExBackReference {
	public static void main(String ...args){
		String input ="00111100131341313";
		String regEx= "(\\d\\d)\\1";
		Pattern p = Pattern.compile(regEx);
		Matcher m = p.matcher(input);
		while(m.find()){
			System.out.println("Start: "+m.start());
			System.out.println("End: "+m.end());
		}
	
	}
}

 

Output:

Start: 2
End: 6
Start: 8
End: 12
Start: 13
End: 17

 

Q: Write a pattern that returns true only when the input contains nothing but letters.

You can achieve this with the pattern below. 

public static boolean containsOnlyAlphabets(String data){
	Pattern regEx = Pattern.compile("^[A-Za-z]+$");
	Matcher matcher = regEx.matcher(data);
	return matcher.find();
}

System.out.println(containsOnlyAlphabets("test-alpha")); //false, contains a dash (-) character
System.out.println(containsOnlyAlphabets("test alpha")); //false, contains a space character
System.out.println(containsOnlyAlphabets("testalpha")); //true, contains only letters

 

Similarly, you can create a pattern for numbers ("^[0-9]+$") that matches only when the input contains nothing but digits.

 

I will keep updating this article if I find other scenarios. 
