Java String.split() Regex

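// ops: the non-letter tokens (operators); ops[0] is an empty string when str starts with a letter
String[] ops = str.split("\\s*[a-zA-Z]+\\s*");
// notops: the letter tokens (operands)
String[] notops = str.split("\\s*[^a-zA-Z]+\\s*");
// interleave them: operands at even indices, operators at odd indices
String[] res = new String[ops.length+notops.length-1];
for(int i=0; i<res.length; i++) res[i] = i%2==0 ? notops[i/2] : ops[i/2+1];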

This should do it. Everything nicely stored in res.
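As a rough illustration of how the interleaving works, here is a minimal runnable sketch. The class name, the `tokenize` helper and the sample input are assumptions made for the example; it also assumes the input starts and ends with a run of letters, which is what the index arithmetic relies on.

```java
import java.util.Arrays;

public class InterleaveTokens {

    // Hypothetical helper wrapping the snippet above.
    // Assumes str starts and ends with letters, so ops[0] is an empty string.
    static String[] tokenize(String str) {
        String[] ops = str.split("\\s*[a-zA-Z]+\\s*");       // non-letter tokens
        String[] notops = str.split("\\s*[^a-zA-Z]+\\s*");   // letter tokens
        String[] res = new String[ops.length + notops.length - 1];
        for (int i = 0; i < res.length; i++) {
            // even slots take the next letter token, odd slots the next non-letter token
            res[i] = i % 2 == 0 ? notops[i / 2] : ops[i / 2 + 1];
        }
        return res;
    }

    public static void main(String[] args) {
        // Sample input chosen for illustration only
        System.out.println(Arrays.toString(tokenize("a + b - c"))); // [a, +, b, -, c]
    }
}
```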

Java - How split(regex, limit) method actually works?

What I understand from the documentation:

The limit parameter controls the number of times the pattern is
applied and therefore affects the length of the resulting array. If
the limit n is greater than zero then the pattern will be applied at
most n - 1 times, the array's length will be no greater than n, and
the array's last entry will contain all input beyond the last matched
delimiter. If n is non-positive then the pattern will be applied as
many times as possible and the array can have any length. If n is zero
then the pattern will be applied as many times as possible, the array
can have any length, and trailing empty strings will be discarded.

In other words, a positive limit n caps the number of pieces the string is cut into. Let's go through the values one by one, using str = "boo:and:foo":

Limit 1

String[] spl1 = str.split("o", 1);

A limit of 1 means the pattern is applied zero times, so the string is not cut at all and you get your whole input back as a single element:

[boo:and:foo]
1

Limit 2

String[] spl1 = str.split("o", 2);

This means cut it once, at the first o:

    boo:and:foo
-----^

In this case you get two results:

[b,o:and:foo]
1 2

Limit 3

String[] spl1 = str.split("o", 3);

This means cut it twice, at the first and the second o:

    boo:and:foo
1----^^--------------2

In this case you get three results (the second one is an empty string):

[b, ,:and:foo]
1 2 3

Limit 4

String[] spl1 = str.split("o", 4);

This means cut it three times, at the first, second and third o:

    boo:and:foo
     ^^      ^
     12      3

In this case you get four results:

[b, ,:and:f,o]
1 2 3 4

Limit 5

String[] spl1 = str.split("o", 5);

This means cut it four times, at the first, second, third and fourth o:

    boo:and:foo
     ^^      ^^
     12      34

In this case you get five results:

[b, ,:and:f, , ]
1 2 3 4 5
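The walkthrough above can be checked with a short program. This is only a sketch: the class name and the `show` helper are made up for the example. It also prints the results for a limit of 0 and for a negative limit, which the documentation quoted earlier describes but the walkthrough does not cover.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Hypothetical helper: formats the result of split(regex, limit) for printing
    static String show(String input, String regex, int limit) {
        return Arrays.toString(input.split(regex, limit));
    }

    public static void main(String[] args) {
        String str = "boo:and:foo";
        // Positive limits cap the array length; 0 drops trailing empty strings; -1 keeps them
        for (int limit : new int[] {1, 2, 3, 4, 5, 0, -1}) {
            System.out.println("limit " + limit + " -> " + show(str, "o", limit));
        }
    }
}
```

With a limit of 0 the two trailing empty strings are discarded, so the array has three elements; with -1 all five are kept.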


Split a String Into Multiple Strings Using a Regex With Different Capture Groups

The Java class below splits your example string into the parts you are interested in. It splits the original string at every '||' and '&&' delimiter, so if the string contains more than one '||' or '&&' operator, each part is split out.

One thing to note is the need to escape special regex characters with a backslash (\); here that is the pipe |, while & needs no escaping. In a Java string literal the backslash itself must also be escaped, so you need two backslashes to get one literal backslash into the regex.

A good site for testing regex code is Regular Expressions 101.

public class splitStrRegex {

    public static void main(String[] args) {

        String myStr = ":if[input=right || input=Right && input != null]:";
        String[] myStrings = myStr.split("\\|\\||&&");
        for (int i = 0; i < myStrings.length; i++) {
            System.out.println(myStrings[i].trim());
        }
    }
}

Output:

:if[input=right
input=Right
input != null]:

Java string split with . (dot)

You need to escape the dot if you want to split on a literal dot:

String extensionRemoved = filename.split("\\.")[0];

Otherwise you are splitting on the regex ., which means "any character".

Note the double backslash needed to create a single backslash in the regex.


You're getting an ArrayIndexOutOfBoundsException because your input string is just a dot, i.e. ".", an edge case that produces an empty array when split on a dot. split(regex) removes all trailing empty strings from the result, and splitting a dot on a dot yields only two empty strings, so once they are removed you're left with an empty array.

To avoid the exception for this edge case, use the overloaded version split(regex, limit), whose second parameter limits the size of the resulting array. When limit is negative, trailing empty strings are no longer removed from the resulting array:

".".split("\\.", -1) // returns an array of two blanks, ie ["", ""]

That is, when filename is just a dot ".", calling filename.split("\\.", -1)[0] returns an empty string, whereas calling filename.split("\\.")[0] throws an ArrayIndexOutOfBoundsException.
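The edge case can be wrapped in a small helper. This is a sketch only: `beforeFirstDot` and the class name are hypothetical, and it assumes a non-null filename.

```java
public class ExtensionDemo {

    // Hypothetical helper: returns the text before the first dot.
    // The negative limit keeps trailing empty strings, so index 0 always exists.
    static String beforeFirstDot(String filename) {
        return filename.split("\\.", -1)[0];
    }

    public static void main(String[] args) {
        System.out.println(beforeFirstDot("report.txt")); // report
        System.out.println(beforeFirstDot("noext"));      // noext
        System.out.println(beforeFirstDot(".").isEmpty()); // true
        System.out.println(".".split("\\.").length);      // 0
    }
}
```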

Splitting string separated with comma with Regex instead of String.split

I'm not sure where you've gone wrong, but in your reply to AlanMoore's helpful suggestion you said you could only match the first word. The code below shows that it matches all the words:

// requires java.util.regex.Matcher and java.util.regex.Pattern
String s = "o_9,o_8,x_7,o_6,o_5";
Pattern p = Pattern.compile("\\w+");
Matcher m = p.matcher(s);
while (m.find()) {
    String matchedWord = m.group();
    System.out.println("Matched \"" + matchedWord + "\".");
}

Output:

Matched "o_9".
Matched "o_8".
Matched "x_7".
Matched "o_6".
Matched "o_5".

For performance, the Pattern should be compiled once - preferably as a private static final near the top of your class. Avoid creating an array of strings (as String.split() does) unless you really need to.
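One way that advice might look in practice is sketched below. The class name, the `WORD` constant and the `forEachWord` helper are illustrative assumptions; each match is handed to a callback instead of being collected into an array.

```java
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordFinder {

    // Compiled once and reused for every call
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Hypothetical helper: passes each matched word to the given action,
    // without building an intermediate String[]
    static void forEachWord(String s, Consumer<String> action) {
        Matcher m = WORD.matcher(s);
        while (m.find()) {
            action.accept(m.group());
        }
    }

    public static void main(String[] args) {
        forEachWord("o_9,o_8,x_7,o_6,o_5",
                word -> System.out.println("Matched \"" + word + "\"."));
    }
}
```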

Java: Split String around +

Try this:

final String string = "Strg+Q";
final String[] parts = string.split("\\+");
System.out.println(parts[0]); // Strg
System.out.println(parts[1]); // Q

Java RegEx split method

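You can use a negative lookahead in your split so that plain whitespace only counts as a delimiter after the colon. The pattern splits on the ":" together with the whitespace that follows it, or on any whitespace that has no ":" later in the string.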

String s = "PICK TWO IN ORDER: 10 2 T F";
String[] foo = s.split(":\\s|\\s(?!.*:)");

Output:

PICK TWO IN ORDER
10
2
T
F

use regex in String.Split() for extract text between { and } in java

You are on the right track. Just replace the brackets and the string:

String n = "${farsiName} - {symbolName}";

// group 1 captures the text between { and }
Pattern p = Pattern.compile("\\{(.*?)\\}");
Matcher m = p.matcher(n);

while (m.find()) {
    System.out.println(m.group(1));
}

java-regex split string after 80 char. if middle of string split to closest space before

It might be hard to articulate a regex pattern which is smart enough to split on whitespace only up to 80 characters at a time. However, if we use a formal regex pattern matcher, it is fairly straightforward:

String myStr = "very long string supposed to exceed eighty characters because some addresses are very long but they have space so should not be an issue to split them";
String pattern = ".{1,80}(?!\\S)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(myStr);
while (m.find()) {
    System.out.println(m.group(0).trim() + " (length: " + m.group(0).trim().length() + ")");
}

This prints:

very long string supposed to exceed eighty characters because some addresses are (length: 80)
very long but they have space so should not be an issue to split them (length: 69)
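The same pattern can be turned into a reusable method with a configurable width. This is a sketch under assumptions: `wrap` and the class name are made up for the example, the text is assumed to contain no line breaks, and every word is assumed to be no longer than the width (this pattern does not handle longer words well).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WrapDemo {

    // Hypothetical helper: cuts text into chunks of at most `width` characters,
    // each ending just before whitespace or at the end of the input
    static List<String> wrap(String text, int width) {
        Pattern p = Pattern.compile(".{1," + width + "}(?!\\S)");
        Matcher m = p.matcher(text);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            String line = m.group().trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Small width so the effect is easy to see
        for (String line : wrap("aaa bbb ccc ddd", 10)) {
            System.out.println(line);
        }
    }
}
```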

How do I split a string in Java?

Use the appropriately named method String#split().

String string = "004-034556";
String[] parts = string.split("-");
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556

Note that split's argument is assumed to be a regular expression, so remember to escape special characters if necessary.

There are 12 characters with special meanings: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), the opening square bracket [, and the opening curly brace {. These special characters are often called "metacharacters".

For instance, to split on a period/dot . (which means "any character" in regex), use either backslash \ to escape the individual special character like so split("\\."), or use character class [] to represent literal character(s) like so split("[.]"), or use Pattern#quote() to escape the entire string like so split(Pattern.quote(".")).

String[] parts = string.split(Pattern.quote(".")); // Split on the exact string.
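A quick way to see that the three forms behave the same is to run them side by side. The sketch below uses a made-up class name and a hypothetical `splitLiteral` helper; the sample strings are for illustration only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class EscapeDemo {

    // Hypothetical helper: splits on the delimiter taken literally, whatever characters it contains
    static String[] splitLiteral(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String s = "a.b.c";
        System.out.println(Arrays.toString(s.split("\\.")));        // [a, b, c]
        System.out.println(Arrays.toString(s.split("[.]")));        // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral(s, ".")));  // [a, b, c]
        // Unescaped, the dot matches every character, so nothing is left over
        System.out.println(s.split(".").length);                    // 0
    }
}
```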

To test beforehand if the string contains certain character(s), just use String#contains().

if (string.contains("-")) {
    // Split it.
} else {
    throw new IllegalArgumentException("String " + string + " does not contain -");
}

Note, this does not take a regular expression. For that, use String#matches() instead.
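The difference between the two checks can be sketched as follows. The class name, both helper methods and the digits-dash-digits pattern are assumptions chosen for the example; note that matches() has to match the entire string, not just part of it.

```java
public class ContainsVsMatches {

    // Plain substring test, no regex involved
    static boolean hasDash(String s) {
        return s.contains("-");
    }

    // Regex test; the illustrative pattern requires digits, a dash, then digits,
    // and must cover the whole string
    static boolean isDigitsDashDigits(String s) {
        return s.matches("\\d+-\\d+");
    }

    public static void main(String[] args) {
        System.out.println(hasDash("004-034556"));            // true
        System.out.println(isDigitsDashDigits("004-034556")); // true
        System.out.println(hasDash("004-"));                  // true
        System.out.println(isDigitsDashDigits("004-"));       // false
    }
}
```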

If you'd like to retain the split character in the resulting parts, use a positive lookaround. To have the split character end up on the left-hand side, use a positive lookbehind by wrapping the pattern in a (?<=...) group.

String string = "004-034556";
String[] parts = string.split("(?<=-)");
String part1 = parts[0]; // 004-
String part2 = parts[1]; // 034556

To have the split character end up on the right-hand side, use a positive lookahead by wrapping the pattern in a (?=...) group.

String string = "004-034556";
String[] parts = string.split("(?=-)");
String part1 = parts[0]; // 004
String part2 = parts[1]; // -034556

If you'd like to limit the number of resulting parts, supply the desired number as the second argument of the split() method.

String string = "004-034556-42";
String[] parts = string.split("-", 2);
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556-42

