How to split a string, but also keep the delimiters?
You can use lookahead and lookbehind, which are features of regular expressions.
System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));
And you will get:
[a;, b;, c;, d]
[a, ;b, ;c, ;d]
[a, ;, b, ;, c, ;, d]
The last one is what you want.
((?<=;)|(?=;)) matches the empty string just after ; or just before ;.
EDIT: Fabian Steeg's comments on readability are valid. Readability is always a problem with regular expressions. One thing I do to make regular expressions more readable is to store them in a variable whose name says what the expression does. You can even put placeholders (e.g. %1$s) in the pattern and use Java's String.format to replace them with the actual string you need; for example:
public static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";
public void someMethod() {
final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
...
}
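One caveat with the format-string approach: the delimiter is inserted into the regex unescaped, so a delimiter such as . or | would be read as a regex metacharacter. A minimal sketch of one way around that, assuming the delimiter should always be treated literally, is to pass it through Pattern.quote first. The class and method names (SplitWithDelimiter, splitKeeping) are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitWithDelimiter {
    // Same pattern as above: an empty match just after or just before the delimiter.
    static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

    // Hypothetical helper: quotes the delimiter so it is matched literally.
    static String[] splitKeeping(String input, String delimiter) {
        String regex = String.format(WITH_DELIMITER, Pattern.quote(delimiter));
        return input.split(regex);
    }

    public static void main(String[] args) {
        // prints [a, ;, b, ;, c, ;, d]
        System.out.println(Arrays.toString(splitKeeping("a;b;c;d", ";")));
        // prints [a, ., b, ., c] (an unquoted "." would match any character)
        System.out.println(Arrays.toString(splitKeeping("a.b.c", ".")));
    }
}
```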
Regex: how to split a String without removing the separator, adding it as a separate element
You need to use lookahead and lookbehind, as follows:
String string1="Ram-sita-laxman";
System.out.println(Arrays.toString(string1.split("((?<=-)|(?=-))")));
The output will be [Ram, -, sita, -, laxman]
Notice that the delimiters are kept as separate elements. The elements are not shown in quotes because Arrays.toString does not print them; add the quotes yourself if you need them.
Hope this helps.
Java: Split String with Regex without deleting delimiters
String s="Hi, <name> pls visit <url>";
String[] ss = s.split("(?<=> )|(?=<)");
System.out.println(Arrays.toString(ss));
The code above prints:
[Hi, , <name> , pls visit , <url>]
Java string split with . (dot)
You need to escape the dot if you want to split on a literal dot:
String extensionRemoved = filename.split("\\.")[0];
Otherwise you are splitting on the regex ., which means "any character".
Note the double backslash needed to create a single backslash in the regex.
You're getting an ArrayIndexOutOfBoundsException because your input string is just a dot, i.e. ".", which is an edge case that produces an empty array when split on dot. split(regex) removes all trailing empty strings from the result, and since splitting a dot on a dot leaves only two empty strings, removing them leaves you with an empty array.
To avoid getting an ArrayIndexOutOfBoundsException for this edge case, use the overloaded version split(regex, limit), whose second parameter limits the size of the resulting array. When limit is negative, trailing empty strings are no longer removed from the result:
".".split("\\.", -1) // returns an array of two blanks, ie ["", ""]
That is, when filename is just a dot ".", calling filename.split("\\.", -1)[0] returns an empty string, but calling filename.split("\\.")[0] throws an ArrayIndexOutOfBoundsException.
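The difference between the two overloads can be checked with a small sketch. The helper name beforeFirstDot and the sample file names are assumptions made for illustration; the only technique used is the split(regex, limit) call with a negative limit described above.

```java
final class ExtensionStripper {
    // Hypothetical helper: returns the part of the name before the first dot.
    static String beforeFirstDot(String filename) {
        // A negative limit keeps trailing empty strings, so index 0 always exists.
        return filename.split("\\.", -1)[0];
    }

    public static void main(String[] args) {
        System.out.println(".".split("\\.").length);      // prints 0
        System.out.println(".".split("\\.", -1).length);  // prints 2
        System.out.println(beforeFirstDot("report.pdf")); // prints report
        System.out.println(beforeFirstDot(".").isEmpty()); // prints true
    }
}
```

Note that this keeps only the text before the first dot, so a name like "archive.tar.gz" becomes "archive".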
How can I split a string in Java and retain the delimiters?
str.split("(?=[:;])")
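This will give you the desired array, only with an empty first item when the string starts with a delimiter (on Java 7 and earlier; since Java 8, a zero-width match at the beginning of the string no longer produces an empty leading element). And: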
str.split("(?=\\b[:;])")
This will give the array without the empty first item.
- The key here is (?=X), which is a zero-width positive lookahead (a non-capturing construct) (see regex pattern docs).
- [:;] means "either ; or :".
- \b is a word boundary; it is there so that the first : is not treated as a delimiter (since it is at the beginning of the sequence).
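As a rough illustration of the lookahead split, here is a small sketch. The sample string is an assumption (the original input is not shown here) and deliberately does not start with a delimiter, so the result does not depend on the Java version. The class and method names are made up for the example.

```java
import java.util.Arrays;

final class LookaheadSplit {
    // Splits at the empty position just before each ':' or ';',
    // so every delimiter stays at the start of the following element.
    static String[] splitBeforeDelimiters(String input) {
        return input.split("(?=[:;])");
    }

    public static void main(String[] args) {
        // prints [alpha, ;beta, :gamma]
        System.out.println(Arrays.toString(splitBeforeDelimiters("alpha;beta:gamma")));
    }
}
```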
Use String.split() with multiple delimiters
I think you need to include the regex OR operator:
String[] tokens = pdfName.split("-|\\.");
What you have will match:
[DASH followed by DOT together] -.
not
[DASH or DOT, either of them] - or .
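To see the difference, here is a minimal sketch. The file name "invoice-2021.pdf" is a made-up value standing in for pdfName, and the character-class variant [-.] is an equivalent alternative to the OR operator, not something taken from the question.

```java
import java.util.Arrays;

final class MultiDelimiterSplit {
    // Splits on a dash OR a literal dot, using alternation.
    static String[] splitWithAlternation(String name) {
        return name.split("-|\\.");
    }

    // Same result with a character class; inside [] the dot needs no escaping.
    static String[] splitWithCharClass(String name) {
        return name.split("[-.]");
    }

    public static void main(String[] args) {
        String pdfName = "invoice-2021.pdf"; // hypothetical sample value
        // prints [invoice, 2021, pdf]
        System.out.println(Arrays.toString(splitWithAlternation(pdfName)));
        // prints [invoice, 2021, pdf]
        System.out.println(Arrays.toString(splitWithCharClass(pdfName)));
        // "-\\." only matches a dash immediately followed by a dot,
        // so nothing is split here: prints [invoice-2021.pdf]
        System.out.println(Arrays.toString(pdfName.split("-\\.")));
    }
}
```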
JS string.split() without removing the delimiters
Try this:
- Replace every "d" with ",d"
- Split on "," (this assumes the original string contains no commas)
var string = "abcdeabcde";
var newstringreplaced = string.replace(/d/gi, ",d");
var newstring = newstringreplaced.split(",");
return newstring;
Hope this helps.
Splitting a string by a space without removing the space?
You can use a regex, although it is probably overkill:
StringCollection resultList = new StringCollection();
Regex regexObj = new Regex(@"(?:\b\w+\b|\s)");
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
resultList.Add(matchResult.Value);
matchResult = matchResult.NextMatch();
}
Splitting on regex without removing delimiters
You can use re.findall with the regex .*?[.!\?]; the lazy quantifier *? makes sure each match extends only up to the next delimiter you want to match on:
import re
s = """You! Are you Tom? I am Danny."""
re.findall(r'.*?[.!\?]', s)  # raw string, so \? is not treated as a string escape
# ['You!', ' Are you Tom?', ' I am Danny.']
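For comparison, the same find-all idea can be sketched in Java with Matcher.find, which returns successive matches much as re.findall does. The class and method names (SentenceFinder, findSentences) are made up for illustration; only the regex is taken from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SentenceFinder {
    // Lazily match any characters up to and including the next '.', '!' or '?'.
    private static final Pattern SENTENCE = Pattern.compile(".*?[.!?]");

    // Hypothetical helper: collects every match, delimiters included.
    static List<String> findSentences(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = SENTENCE.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Yields "You!", " Are you Tom?" and " I am Danny."
        System.out.println(findSentences("You! Are you Tom? I am Danny."));
    }
}
```

As in the Python version, any text after the last delimiter is not returned, because it never completes a match.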