Whitespace Matching Regex - Java

Whitespace Matching Regex - Java

Yeah, you need to grab the result of matcher.replaceAll():

String result = matcher.replaceAll(" ");
System.out.println(result);
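As a minimal sketch of why the return value matters: Java strings are immutable, so replaceAll() cannot change the input in place. The class name, the collapse helper, and the sample input below are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceAllDemo {
    // Hypothetical helper: collapses each run of whitespace into one space.
    static String collapse(String input) {
        Matcher matcher = Pattern.compile("\\s+").matcher(input);
        // replaceAll() returns a new String; the input is left untouched.
        return matcher.replaceAll(" ");
    }

    public static void main(String[] args) {
        String input = "a  b\t\tc";
        String result = collapse(input);
        System.out.println(result);               // a b c
        System.out.println(input.equals(result)); // false
    }
}
```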

Make Regex Match Whitespaces in Java

\s matches a single whitespace character. Inside a Java string literal the backslash itself has to be escaped, so you write \\s. To match zero or more whitespace characters, use \\s*.
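A small sketch of the escaping rule, assuming a made-up pattern of "a", optional whitespace, then "b" (the class and helper names are hypothetical):

```java
import java.util.regex.Pattern;

public class EscapedWhitespace {
    // Hypothetical helper: true if the whole input is "a", optional whitespace, "b".
    static boolean matchesAb(String input) {
        // The Java literal "a\\s*b" is the regex a\s*b.
        return Pattern.matches("a\\s*b", input);
    }

    public static void main(String[] args) {
        // The literal "\\s" holds two characters: a backslash and an s.
        System.out.println("\\s".length());      // 2
        System.out.println(matchesAb("ab"));     // true
        System.out.println(matchesAb("a \t b")); // true
        System.out.println(matchesAb("a-b"));    // false
    }
}
```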

This will match a single domain and TLD:

([A-Za-z][A-Za-z0-9.\\-_]*)\\s*(at)\\s*([A-Za-z][A-Za-z0-9\\-_()]*\\s*(dot)\\s*[A-Za-z]+)

However, you are trying to match multiple levels of sub-domains, so you need to wrap the domain part of the regular expression, [A-Za-z][A-Za-z0-9\\-_()]*\\s*(dot)\\s*, in ()+ to get one or more of them:

([A-Za-z][A-Za-z0-9.\\-_]*)\\s*(at)\\s*(([A-Za-z][A-Za-z0-9\\-_()]*\\s*(dot)\\s*)+[A-Za-z]+)
(The additions are the extra ( before the second [A-Za-z] and the )+ right after (dot)\\s*.)

Something like this:

import java.util.regex.Pattern;

public class RegexpMatch {
    // user, "at", one or more "label dot" parts, then the TLD; whitespace around at/dot is optional
    static final Pattern REGEX = Pattern.compile(
        "([A-Za-z][A-Za-z0-9.\\-_]*)\\s*(at)\\s*(([A-Za-z][A-Za-z0-9\\-_()]*\\s*(dot)\\s*)+[A-Za-z]+)"
    );

    public static void main( final String[] args ){
        final String[] tests = {
            "abcdatcsdotuniversitydotedu",
            "abcd at cs dot university dot edu"
        };

        // matches() requires the whole string to fit the pattern
        for ( final String test : tests )
            System.out.println( test + " - " + ( REGEX.matcher( test ).matches() ? "Match" : "No Match" ) );
    }
}

Which outputs:

abcdatcsdotuniversitydotedu - Match
abcd at cs dot university dot edu - Match

Regular expression match fails if only whitespace after the - character

Something like,

^\d+\.\d+\.\d+(?:\s*-\s*\w+)?\/\d+\.\d+\.\d+\.\d+(?:\s*-\s*\w+)?\.txt$

Or you can combine the \.\d+ repetitions as

^\d+(?:\.\d+){2}(?:\s*-\s*\w+)?\/\d+(?:\.\d+){3}(?:\s*-\s*\w+)?\.txt$



Changes

  • .{1} When you want to repeat something exactly once, there is no need for {}. It's implicit.

  • (?:\s*-\s*\w+) Matches zero or more whitespace characters (\s*), followed by -, zero or more whitespace characters again, and then \w+, a description of at least one character.

    • The ? at the end of this pattern makes it optional.
    • The same pattern is repeated at the end to match the second part.
  • ^ Anchors the regex at the start of the string.
  • $ Anchors the regex at the end of the string. These two are necessary so that nothing else is allowed in the string.
  • Don't group patterns using () unless you need to capture them, since capturing costs extra memory. Use (?:..) if you want to group patterns without capturing them.
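A rough Java sketch of the combined pattern, assuming the dot before txt is escaped and the slash is written without a backslash (it needs none in Java). The class name, helper, and sample file names are invented for illustration.

```java
import java.util.regex.Pattern;

public class VersionPathDemo {
    // Combined pattern as a Java string literal; every group is non-capturing.
    static final Pattern PATH = Pattern.compile(
        "^\\d+(?:\\.\\d+){2}(?:\\s*-\\s*\\w+)?/\\d+(?:\\.\\d+){3}(?:\\s*-\\s*\\w+)?\\.txt$");

    // Hypothetical helper: true if the whole string fits the pattern.
    static boolean isValid(String s) {
        return PATH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("1.2.3/1.2.3.4.txt"));                // true
        System.out.println(isValid("1.2.3 - alpha/1.2.3.4 - beta.txt")); // true
        // A dash with no description after it is rejected.
        System.out.println(isValid("1.2.3 - /1.2.3.4.txt"));             // false
        // (?:...) groups are not counted as capturing groups.
        System.out.println(PATH.matcher("").groupCount());               // 0
    }
}
```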

Regex match a whitespace table

You may trim the input first, split on runs of 3 or more whitespace characters, and then check that you got at least the first 2 cell values before using them:

String s = "           here is a $$ cell               here  another         cells I dont care about.........";
String[] res = s.trim().split("\\s{3,}");
if (res.length > 1) {
    System.out.println(res[0]); // Item 1
    System.out.println(res[1]); // Item 2, the rest is unimportant
}


RegEx to match lines consisting of whitespace only

Use string.trim().isEmpty(): it trims leading and trailing whitespace and then checks whether the remaining length is 0.
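For comparison, a short sketch that puts the trim() check next to a regex equivalent. The class and method names are hypothetical, and the \\s* variant is an alternative added here, not part of the answer above.

```java
public class BlankLineDemo {
    // The approach from the answer: trim, then test for emptiness.
    static boolean isBlankByTrim(String line) {
        return line.trim().isEmpty();
    }

    // Regex alternative: String.matches() requires the whole string to match.
    static boolean isBlankByRegex(String line) {
        return line.matches("\\s*");
    }

    public static void main(String[] args) {
        String[] samples = { "", " \t ", " x " };
        for (String s : samples) {
            System.out.println(isBlankByTrim(s) + " " + isBlankByRegex(s));
        }
        // prints: true true, true true, false false (one pair per line)
    }
}
```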

Java Regex match space before or after characters

You may use

^(?!\s)(?!.*\s[*?])(?!.*[*?]\s)(?:[?\s]*[a-zA-Z0-9]){2}[a-zA-Z0-9?\s]*\*?$


Usage note: if you use it with Java's .matches() method, the ^ and $ can be removed from the pattern, because .matches() already requires the whole string to match. Remember to double the backslashes in the Java string literal (\s becomes \\s).

Details

  • ^ - start of string
  • (?!\s) - no whitespace is allowed immediately to the right (at the start of the string)
  • (?!.*\s[*?]) - a whitespace immediately followed by * or ? is not allowed anywhere in the string (.* matches any 0+ chars, as many as possible)
  • (?!.*[*?]\s) - a * or ? immediately followed by a whitespace is not allowed anywhere in the string
  • (?:[?\s]*[a-zA-Z0-9]){2} - two sequences of

    • [?\s]* - 0 or more ? or/and whitespaces
    • [a-zA-Z0-9] - an alphanumeric char
  • [a-zA-Z0-9?\s]* - 0 or more letters, digits, ? or whitespaces
  • \*? - an optional * char
  • $ - end of the string.
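The breakdown above can be tried out with a sketch like the following. It assumes the pattern is used with matches(), so ^ and $ are dropped; the class name, helper, and sample inputs are invented.

```java
import java.util.regex.Pattern;

public class WildcardInputDemo {
    // Same pattern without ^ and $, with backslashes doubled for the Java literal.
    static final Pattern INPUT = Pattern.compile(
        "(?!\\s)(?!.*\\s[*?])(?!.*[*?]\\s)(?:[?\\s]*[a-zA-Z0-9]){2}[a-zA-Z0-9?\\s]*\\*?");

    // Hypothetical helper: true if the whole input is accepted.
    static boolean isValid(String s) {
        return INPUT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("ab"));   // true
        System.out.println(isValid("a b"));  // true
        System.out.println(isValid("a?b"));  // true
        System.out.println(isValid("ab*"));  // true
        System.out.println(isValid("a"));    // false: fewer than two alphanumerics
        System.out.println(isValid(" ab"));  // false: leading whitespace
        System.out.println(isValid("ab *")); // false: whitespace before *
        System.out.println(isValid("a? b")); // false: whitespace after ?
    }
}
```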

How to express several whitespaces in a regex?

The \s metacharacter already matches tabs. What you're missing is the quantifier. You need a + or * quantifier (depending on whether you allow no space at all between the two segments) to match any number of whitespace characters.
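To illustrate the difference between the two quantifiers, here is a small sketch that uses "foo" and "bar" as stand-ins for the two segments (both names, and the class itself, are placeholders):

```java
import java.util.regex.Pattern;

public class QuantifierDemo {
    // \s+ demands at least one whitespace character between the segments.
    static boolean withPlus(String s) {
        return Pattern.matches("foo\\s+bar", s);
    }

    // \s* also accepts the segments with nothing between them.
    static boolean withStar(String s) {
        return Pattern.matches("foo\\s*bar", s);
    }

    public static void main(String[] args) {
        System.out.println(withPlus("foo \t bar")); // true: spaces and a tab
        System.out.println(withPlus("foobar"));     // false
        System.out.println(withStar("foobar"));     // true
    }
}
```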

java regex find all whitespace in a string

You can use negative lookahead to check for spaces:

^(?!.* )

^ - Start matching at the beginning of the string.

(?! - Begin a negative lookahead group (the pattern inside the parentheses must not come next).

.* followed by a space - Any number of non-newline characters, then a literal space.

) - Close the negative lookahead group.

Combined with the full regex pattern (also cleaned up a bit to remove redundancy, and written with the doubled backslashes of a Java string literal):

^(?!.* )(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)(?=.*[!@$%&*?])[A-Za-z\\d!@$%&*?]+
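A hedged sketch of how the combined pattern behaves with matches(). The class name, helper, and sample strings are made up, and the original question's exact requirements are not shown here.

```java
import java.util.regex.Pattern;

public class NoSpaceCheckDemo {
    // No spaces, plus at least one lowercase letter, uppercase letter,
    // digit, and one of the listed special characters.
    static final Pattern CHECK = Pattern.compile(
        "^(?!.* )(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)(?=.*[!@$%&*?])[A-Za-z\\d!@$%&*?]+");

    // Hypothetical helper: true if the whole input passes every check.
    static boolean isAccepted(String s) {
        return CHECK.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("Passw0rd!"));  // true
        System.out.println(isAccepted("Pass w0rd!")); // false: contains a space
        System.out.println(isAccepted("password"));   // false: no uppercase, digit, or special
    }
}
```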

Java regex match whitespace and non-whitespace characters 2 times

There are two problems in your code.

  • You always need to call matches() or find() before invoking the .group() methods on the matcher object.
  • Your regex is incorrectly grouped.

Currently your group only gives the last match, so you need to wrap the whole repeated expression in a group. The correct regex, written as a Java string literal, is:

.*\n((?:\\S+\\s+){2})(.*)

Try this Java code:

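import java.util.regex.Matcher;
import java.util.regex.Pattern;

String pattern = ".*\n((?:\\S+\\s+){2})(.*)";
String str = "Filesystem 1MB-blocks Used Available Use% Mounted on\n" +
    "/dev/sda6 72342MB 5013MB 63655MB 8% /common";
// DOTALL lets the leading .* run across the line break
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(str);
if (m.matches()) {
    // group 1 holds the first two columns of the second line
    System.out.println(m.group(1));
}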

Prints,

/dev/sda6 72342MB 
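The first point, calling find() or matches() before group(), can be seen in isolation with a sketch like this one. The class and helper names are hypothetical, and it uses find() on a single line rather than the two-line input above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupBeforeMatchDemo {
    // Hypothetical helper: the first two whitespace-terminated tokens, or null.
    static String firstTwoTokens(String line) {
        Matcher m = Pattern.compile("((?:\\S+\\s+){2})").matcher(line);
        // find() must succeed before group(1) may be read.
        return m.find() ? m.group(1) : null;
    }

    // Shows that group() without a prior match attempt throws.
    static boolean groupWithoutMatchThrows() {
        Matcher m = Pattern.compile("\\S+").matcher("abc");
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + firstTwoTokens("/dev/sda6 72342MB 5013MB") + "]"); // [/dev/sda6 72342MB ]
        System.out.println(firstTwoTokens("single"));                               // null
        System.out.println(groupWithoutMatchThrows());                              // true
    }
}
```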

How to ignore whitespace in regular expression matching

Convert your regex string to

mark|anthony|joseph\s*smith|michael

i.e. replace all spaces with \s* to match zero or more whitespace characters. If you want to match ONLY space (0x20) then make it

mark|anthony|joseph *smith|michael
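If the list of names is built at runtime, the conversion could be done in code. This is only a sketch: the relax helper and class name are hypothetical, and it assumes the names contain no regex metacharacters.

```java
import java.util.regex.Pattern;

public class FlexibleSpaceDemo {
    // Hypothetical helper: replaces each literal space with \s* and compiles the result.
    static Pattern relax(String alternatives) {
        // String.replace() is a literal replacement, so "\\s*" is inserted as \s*.
        return Pattern.compile(alternatives.replace(" ", "\\s*"));
    }

    public static void main(String[] args) {
        Pattern names = relax("mark|anthony|joseph smith|michael");
        System.out.println(names.pattern());                           // mark|anthony|joseph\s*smith|michael
        System.out.println(names.matcher("josephsmith").matches());    // true
        System.out.println(names.matcher("joseph   smith").matches()); // true
        System.out.println(names.matcher("joe smith").matches());      // false
    }
}
```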

