How to Use "." as the Delimiter with String.Split() in Java

How to split a string, but also keep the delimiters?

You can use lookahead and lookbehind, which are features of regular expressions.

System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));

And you will get:

[a;, b;, c;, d]
[a, ;b, ;c, ;d]
[a, ;, b, ;, c, ;, d]

The last one is what you want.

((?<=;)|(?=;)) equals to select an empty character before ; or after ;.

EDIT: Fabian Steeg's comments on readability is valid. Readability is always a problem with regular expressions. One thing I do to make regular expressions more readable is to create a variable, the name of which represents what the regular expression does. You can even put placeholders (e.g. %1$s) and use Java's String.format to replace the placeholders with the actual string you need to use; for example:

static public final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

public void someMethod() {
final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
...
}

How do I split a string in Java?

Use the appropriately named method String#split().

String string = "004-034556";
String[] parts = string.split("-");
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556

Note that split's argument is assumed to be a regular expression, so remember to escape special characters if necessary.

there are 12 characters with special meanings: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), and the opening square bracket [, the opening curly brace {, These special characters are often called "metacharacters".

For instance, to split on a period/dot . (which means "any character" in regex), use either backslash \ to escape the individual special character like so split("\\."), or use character class [] to represent literal character(s) like so split("[.]"), or use Pattern#quote() to escape the entire string like so split(Pattern.quote(".")).

String[] parts = string.split(Pattern.quote(".")); // Split on the exact string.

To test beforehand if the string contains certain character(s), just use String#contains().

if (string.contains("-")) {
// Split it.
} else {
throw new IllegalArgumentException("String " + string + " does not contain -");
}

Note, this does not take a regular expression. For that, use String#matches() instead.

If you'd like to retain the split character in the resulting parts, then make use of positive lookaround. In case you want to have the split character to end up in left hand side, use positive lookbehind by prefixing ?<= group on the pattern.

String string = "004-034556";
String[] parts = string.split("(?<=-)");
String part1 = parts[0]; // 004-
String part2 = parts[1]; // 034556

In case you want to have the split character to end up in right hand side, use positive lookahead by prefixing ?= group on the pattern.

String string = "004-034556";
String[] parts = string.split("(?=-)");
String part1 = parts[0]; // 004
String part2 = parts[1]; // -034556

If you'd like to limit the number of resulting parts, then you can supply the desired number as 2nd argument of split() method.

String string = "004-034556-42";
String[] parts = string.split("-", 2);
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556-42

String delimiter in string.split method

There is no need to set the delimiter by breaking it up in pieces like you have done.

Here is a complete program you can compile and run:

import java.util.Arrays;
public class SplitExample {
public static final String PLAYER = "1||1||Abdul-Jabbar||Karim||1996||1974";
public static void main(String[] args) {
String[] data = PLAYER.split("\\|\\|");
System.out.println(Arrays.toString(data));
}
}

If you want to use split with a pattern, you can use Pattern.compile or Pattern.quote.

To see compile and quote in action, here is an example using all three approaches:

import java.util.Arrays;
import java.util.regex.Pattern;
public class SplitExample {
public static final String PLAYER = "1||1||Abdul-Jabbar||Karim||1996||1974";
public static void main(String[] args) {
String[] data = PLAYER.split("\\|\\|");
System.out.println(Arrays.toString(data));

Pattern pattern = Pattern.compile("\\|\\|");
data = pattern.split(PLAYER);
System.out.println(Arrays.toString(data));

pattern = Pattern.compile(Pattern.quote("||"));
data = pattern.split(PLAYER);
System.out.println(Arrays.toString(data));
}
}

The use of patterns is recommended if you are going to split often using the same pattern. BTW the output is:

[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
[1, 1, Abdul-Jabbar, Karim, 1996, 1974]

Split string with | separator in java

| is treated as an OR in RegEx. So you need to escape it:

String[] separated = line.split("\\|");

Java string split with . (dot)

You need to escape the dot if you want to split on a literal dot:

String extensionRemoved = filename.split("\\.")[0];

Otherwise you are splitting on the regex ., which means "any character".

Note the double backslash needed to create a single backslash in the regex.


You're getting an ArrayIndexOutOfBoundsException because your input string is just a dot, ie ".", which is an edge case that produces an empty array when split on dot; split(regex) removes all trailing blanks from the result, but since splitting a dot on a dot leaves only two blanks, after trailing blanks are removed you're left with an empty array.

To avoid getting an ArrayIndexOutOfBoundsException for this edge case, use the overloaded version of split(regex, limit), which has a second parameter that is the size limit for the resulting array. When limit is negative, the behaviour of removing trailing blanks from the resulting array is disabled:

".".split("\\.", -1) // returns an array of two blanks, ie ["", ""]

ie, when filename is just a dot ".", calling filename.split("\\.", -1)[0] will return a blank, but calling filename.split("\\.")[0] will throw an ArrayIndexOutOfBoundsException.

Use String.split() with multiple delimiters

I think you need to include the regex OR operator:

String[]tokens = pdfName.split("-|\\.");

What you have will match:

[DASH followed by DOT together] -.
not

[DASH or DOT any of them] - or .

Why can't I use . as a delimiter in split() function?

The split function uses regular expressions, you have to escape your "." with a "\"

When using regular expressions a "." means any character. Try this

String delimiter = "\\.x";

It should also be mentioned that \ in java is also a special character used to create other special characters. Therefore you have to escape your \ with another \ hence the "\\.x"


Theres some great documentation in the Java docs about all the special characters and what they do:

Java 8 Docs

Java 7 Docs

Java 6 Docs

Java String Split with multiple delimiter using pipe '|'

You can use the given link for understanding of How Delimiters Works.

How do I use a delimiter in Java Scanner?

Another alternative Way

You can use useDelimiter(String pattern) method of Scanner class. The use of useDelimiter(String pattern) method of Scanner class. Basically we have used the String semicolon(;) to tokenize the String declared on the constructor of Scanner object.

There are three possible token on the String “Anne Mills/Female/18″ which is name,gender and age. The scanner class is used to split the String and output the tokens in the console.

import java.util.Scanner;

/*
* This is a java example source code that shows how to use useDelimiter(String pattern)
* method of Scanner class. We use the string ; as delimiter
* to use in tokenizing a String input declared in Scanner constructor
*/

public class ScannerUseDelimiterDemo {

public static void main(String[] args) {

// Initialize Scanner object
Scanner scan = new Scanner("Anna Mills/Female/18");
// initialize the string delimiter
scan.useDelimiter("/");
// Printing the delimiter used
System.out.println("The delimiter use is "+scan.delimiter());
// Printing the tokenized Strings
while(scan.hasNext()){
System.out.println(scan.next());
}
// closing the scanner stream
scan.close();

}
}

Java splitting string using delimiter and store to different array

Here is one approach which splits in the input on the following regex pattern:

_(?!.*_)

This splits the input string on only the last underscore character. We can try iterating your collection of inputs, and then populating the two arrays.

List<String> inputs = Arrays.asList(new String[] {"1564095_SINGLE_true", "1564096_SINGLE_true"});
String[] arrayA = new String[2];
String[] arrayB = new String[2];
int index = 0;
for (String input : inputs) {
arrayA[index] = input.split("_(?!.*_)")[0];
arrayB[index] = input.split("_(?!.*_)")[1];
++index;
}
System.out.println("A[]: " + Arrays.toString(arrayA));
System.out.println("B[]: " + Arrays.toString(arrayB));

This prints:

A[]: [1564095_SINGLE, 1564096_SINGLE]
B[]: [true, true]


Related Topics



Leave a reply



Submit