
Regular Expression in Java

A regular expression is a pattern used to search, edit, and manipulate text and data. Regular expressions are built from a set of special characters and constructs.

With the help of regular-expression constructs you can create complex regular expressions. I won't explain here what a regular expression is or what the different constructs are, because all of this is explained very well in the Oracle documentation for Pattern and in Oracle's tutorial Introduction to Regular Expressions.

The regular-expression classes are found in the java.util.regex package, which consists of three primary classes.

The descriptions of the classes below are taken from the Oracle tutorial.

 

  • Pattern - A Pattern object is a compiled representation of a regular expression. The Pattern class provides no public constructors. To create a pattern, you must first invoke one of its public static compile methods, which will then return a Pattern object. These methods accept a regular expression as the first argument.
  • Matcher - A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher defines no public constructors. You obtain a Matcher object by invoking the matcher method on a Pattern object. 
  • PatternSyntaxException - A PatternSyntaxException object is an unchecked exception that indicates a syntax error in a regular expression pattern.
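
As a rough illustration of how the three classes fit together, here is a minimal sketch. The class name RegexClassesDemo, its two helper methods, and the sample strings are made up for this example and are not part of the Oracle tutorial.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; names and sample inputs are illustrative only.
class RegexClassesDemo {

    // Returns true if the regex compiles, false if its syntax is invalid.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Unchecked exception thrown by compile() for a malformed pattern.
            return false;
        }
    }

    // Returns the first run of digits in the input, or null if there is none.
    static String firstNumber(String input) {
        Pattern digits = Pattern.compile("[0-9]+"); // no public constructor
        Matcher matcher = digits.matcher(input);    // Matcher is obtained from the Pattern
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("order 42 shipped")); // 42
        System.out.println(isValidRegex("[0-9]+"));          // true
        System.out.println(isValidRegex("[0-9"));            // false, unclosed character class
    }
}
```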

  Let's understand regular expressions through examples and use cases.

Use Case 1:

Write a Java method to validate input text in which users are allowed to enter only a certain set of special characters. If they enter anything outside the defined set of characters, you have to show an error message.
Let's say the input field should accept only the following characters: "#$%/\\^*,.-_+=:;?@!"

RegExPatternUtility.java

package com.dk.ex.reg.ex.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegExPatternUtility {
    /**
     * Returns true when data contains at least one character matched by
     * pattern and none of the characters excluded below (letters, digits
     * and the symbols <>`~'(){}[]).
     *
     * Note that characters in neither set, such as whitespace, are not
     * rejected by this check.
     *
     * You can modify this method as per your need. For example, you can
     * pass the set of characters to reject as another parameter, e.g.
     * containsOnlySpecificCharacter(String pattern,
     * String excludeStrPattern, String data)
     *
     * @param pattern character class listing the allowed characters
     * @param data    text to validate
     * @return true if data has an allowed character and no excluded one
     */
    public static boolean containsOnlySpecificCharacter(String pattern,
            String data) {
        Pattern specialCharPattern = Pattern.compile(pattern);
        Matcher spclCharMatcher = specialCharPattern.matcher(data);
        // Characters that must not appear anywhere in the input.
        Pattern otherCharPattern = Pattern
                .compile("[a-zA-Z0-9<>`~'(){}\\[\\]]");
        Matcher otherCharMatcher = otherCharPattern.matcher(data);
        return (spclCharMatcher.find() && !otherCharMatcher.find());
    }
}

 

RegExExample.java

package com.dk.ex.reg.ex.test;
public class RegExExample {
    public static void main(String[] args) {
        testcontainsOnlySpecificChar();
    }
    
    public static void testcontainsOnlySpecificChar(){
        // The hyphen is escaped (\\-) so that it is a literal character
        // rather than the range from '.' to '_'.
        // Prints false because the data contains other characters (letters).
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.\\-_+=:;?@!]",
                        "Dee!@pak"));
        // Prints true because the data contains only the expected characters.
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.\\-_+=:;?@!]",
                        "!*"));
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.\\-_+=:;?@!]",
                        "+_-;!*"));
        // Prints false because the data contains digits.
        System.out.println(RegExPatternUtility
                .containsOnlySpecificCharacter("[#$%/\\^*,.\\-_+=:;?@!]",
                        "1!*4^"));
    }
}

 

In the above example, the RegExPatternUtility class contains the method containsOnlySpecificCharacter(). Its first parameter is the pattern of characters we want to allow, and inside the same method we exclude other text with [a-zA-Z0-9<>`~'(){}\\[\\]]. In this way you allow only the set of characters you want to include.

You can customize this method as per your need (i.e. you can add another parameter for the exclude pattern), as mentioned in the comment of the containsOnlySpecificCharacter() method, so that you can include and exclude any set of characters.
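
The method above rejects only the characters listed in its exclude pattern, so input such as a space would still pass. If you want a stricter check, one option is to require the whole string to match the allowed set. The sketch below assumes that approach; the class name StrictCharacterCheck and the method containsOnlyAllowed are hypothetical, and the hyphen is escaped so that it is treated as a literal character.

```java
import java.util.regex.Pattern;

// Hypothetical alternative to the exclude-list approach: whole-string matching.
class StrictCharacterCheck {

    // One or more characters, each taken from the allowed set.
    // "\\-" keeps the hyphen literal; "\\^" is a literal caret.
    private static final Pattern ALLOWED =
            Pattern.compile("[#$%/\\^*,.\\-_+=:;?@!]+");

    // True only if every character of data belongs to the allowed set.
    static boolean containsOnlyAllowed(String data) {
        return ALLOWED.matcher(data).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsOnlyAllowed("Dee!@pak")); // false
        System.out.println(containsOnlyAllowed("+_-;!*"));   // true
        System.out.println(containsOnlyAllowed("1!*4^"));    // false
        System.out.println(containsOnlyAllowed("! *"));      // false, space is not allowed
    }
}
```

Because matches() must consume the entire input, no separate exclude pattern is needed here. An empty string is also rejected, since the + quantifier requires at least one character.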

 

Use Case 2:

Write Java code to find numbers at the start or end of a string.

RegExPatternUtility.java 

package com.dk.ex.reg.ex.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegExPatternUtility {
            
    /**
     * Returns true if any part of data matches pattern.
     *
     * Note: Pattern.matches(pattern, data) is not equivalent to this
     * method, because it requires the entire input to match the pattern.
     */
    public static boolean isMatchExist(String pattern, String data){
        Pattern patternData = Pattern.compile(pattern);
        Matcher matcher = patternData.matcher(data);
        return matcher.find();
    }
    
}
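
To make the difference between Matcher.find() and Pattern.matches() concrete, here is a small sketch. The class FindVersusMatches and its two helper methods are illustrative names only: find() looks for a match anywhere in the input, while Pattern.matches() succeeds only if the whole input matches.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of partial matching and whole-input matching.
class FindVersusMatches {

    // True if some part of data matches the regex.
    static boolean findAnywhere(String regex, String data) {
        return Pattern.compile(regex).matcher(data).find();
    }

    // True only if the entire data string matches the regex.
    static boolean matchWhole(String regex, String data) {
        return Pattern.matches(regex, data);
    }

    public static void main(String[] args) {
        System.out.println(findAnywhere("^[0-9]", "123string"));  // true
        System.out.println(matchWhole("^[0-9]", "123string"));    // false
        // Extending the pattern to cover the rest of the input makes it match.
        System.out.println(matchWhole("^[0-9].*", "123string"));  // true
    }
}
```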

 

RegExExample.java 

package com.dk.ex.reg.ex.test;
public class RegExExample {
    public static void main(String[] args) {
        testStringStartOrEndWithNumber();
        testStringStartOrEndWithNumbersInSinglePattern();
        testStringStartAndEndWithNumbers();
    }
    public static void testStringStartOrEndWithNumber() {
        /**
         * To test whether a string starts with a digit, the pattern should
         * be ^[0-9]. The character ^ anchors the match at the beginning of
         * the input.
         */
        // true
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("^[0-9]", "123string"));
        // false
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("^[0-9]", "string123"));
        /**
         * To check whether a string ends with a digit, use the character $,
         * which anchors the match at the end of the input. The regex will
         * look like [0-9]$
         */
        // false
        System.out.println("3 : "
                + RegExPatternUtility.isMatchExist("[0-9]$", "123string"));
        // true
        System.out.println("4 : "
                + RegExPatternUtility.isMatchExist("[0-9]$", "string123"));
        /**
         * To check for N digits at the start or end of the string, add a
         * quantifier. You can use \d instead of [0-9]. The available
         * quantifiers are:
         *   X{n}   X, exactly n times
         *   X{n,}  X, at least n times
         *   X{n,m} X, at least n but not more than m times
         * Because find() is used, ^[0-9]{3} also matches a string that
         * starts with more than three digits.
         */
        // true
        System.out.println("5 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}", "123string"));
        // false
        System.out.println("6 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}", "12string"));
        // false, the numbers are at the end.
        System.out.println("7 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}", "string123"));
        // false
        System.out.println("8 : "
                + RegExPatternUtility.isMatchExist("\\d{3}$", "123string"));
        // true
        System.out.println("9 : "
                + RegExPatternUtility.isMatchExist("\\d{3}$", "string123"));
        // false.
        System.out.println("10 : "
                + RegExPatternUtility.isMatchExist("\\d{3}$", "string23"));
    }
    public static void testStringStartOrEndWithNumbersInSinglePattern() {
        // true
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}|[0-9]{3}$",
                        "123string123"));
        // true
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "123string"));
        // true
        System.out.println("3 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "string123"));
        // false.
        System.out.println("4 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "str123ing"));
        // false.
        System.out.println("5 : "
                + RegExPatternUtility.isMatchExist("^\\d{3}|\\d{3}$",
                        "12string12"));
    }
    public static void testStringStartAndEndWithNumbers() {
        // true
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123string123"));
        // false
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "12string123"));
        // false
        System.out.println("3 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123string12"));
        // false
        System.out.println("4 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123string"));
        // false
        System.out.println("5 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "123123"));
        // false
        System.out.println("6 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "string123"));
        // true
        System.out.println("7 : "
                + RegExPatternUtility.isMatchExist("^[0-9]{3}\\w+[0-9]{3}$",
                        "1234string1234"));
    }
    /**
     * In a similar manner you can search for data starting or ending with text.
     * 
     */
}
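
The tests above only report whether a match exists. If you also need the number itself, Matcher.group() returns the matched text. The following is a minimal sketch under that assumption; the class LeadingTrailingNumbers and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that extracts digits from the start or end of a string.
class LeadingTrailingNumbers {

    private static final Pattern LEADING = Pattern.compile("^\\d+");
    private static final Pattern TRAILING = Pattern.compile("\\d+$");

    // Returns the digits at the start of data, or an empty string if none.
    static String leadingNumber(String data) {
        Matcher matcher = LEADING.matcher(data);
        return matcher.find() ? matcher.group() : "";
    }

    // Returns the digits at the end of data, or an empty string if none.
    static String trailingNumber(String data) {
        Matcher matcher = TRAILING.matcher(data);
        return matcher.find() ? matcher.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(leadingNumber("12string345"));  // 12
        System.out.println(trailingNumber("12string345")); // 345
        System.out.println(leadingNumber("str123ing"));    // prints an empty line
    }
}
```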

 

Intersection, Union or Subtraction

You can perform intersection, union or subtraction pattern searches with regular expressions as shown below.

 

  • [0-4[6-8]] matches the digits 0, 1, 2, 3, 4, 6, 7 and 8; the digit 5 does not match (union)
  • [a-z&&[def]] d, e, or f (intersection)
  • [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
  • [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)

 

 Intersection example 

    /**
     * Intersection of character sets. For example, within the a-z character
     * set you want to return true only when the string contains d, e or f.
     *
     * You can use the regex [a-z&&[def]]
     */
    
    public static void intersection(){
        // true 
        System.out.println("1 : "
                + RegExPatternUtility.isMatchExist("[a-z&&[def]]",
                        "abcd"));
        //false
        System.out.println("2 : "
                + RegExPatternUtility.isMatchExist("[a-z&&[def]]",
                        "abc"));
    }
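
The union and subtraction forms from the list can be checked in the same way as the intersection example. This sketch tests one character at a time against a character class; the class CharacterClassOperations and the method matchesSingleChar are names invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical checks for union and subtraction character classes.
class CharacterClassOperations {

    // True if the single character c is matched by the given character class.
    static boolean matchesSingleChar(String characterClass, char c) {
        return Pattern.matches(characterClass, String.valueOf(c));
    }

    public static void main(String[] args) {
        // Union: 0-4 or 6-8, so 5 is excluded.
        System.out.println(matchesSingleChar("[0-4[6-8]]", '7'));    // true
        System.out.println(matchesSingleChar("[0-4[6-8]]", '5'));    // false
        // Subtraction: a-z except b and c.
        System.out.println(matchesSingleChar("[a-z&&[^bc]]", 'd'));  // true
        System.out.println(matchesSingleChar("[a-z&&[^bc]]", 'b'));  // false
        // Subtraction: a-z except m through p.
        System.out.println(matchesSingleChar("[a-z&&[^m-p]]", 'q')); // true
        System.out.println(matchesSingleChar("[a-z&&[^m-p]]", 'n')); // false
    }
}
```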

 

I hope this article was useful. In the next article I will cover the different regex methods in Java.