Java String split removed empty values
split(delimiter) by default removes trailing empty strings from the result array. To turn this mechanism off, use the overloaded version split(delimiter, limit) with limit set to a negative value, like
String[] split = data.split("\\|", -1);
A few more details: split(regex) internally returns the result of split(regex, 0), and in the documentation of this method you can find (emphasis mine):
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.
If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.
If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
Exception:
It is worth mentioning that removing trailing empty strings makes sense only if those empty strings were created by the split mechanism. So for "".split(anything), since we can't split "" any further, we get the array [""] as the result.
This happens because no split took place here, so "", despite being empty and trailing, represents the original string, not an empty string created by the splitting process.
How to prevent split from removing empty elements
Split with a negative limit will preserve trailing empty fields.
@fields = split(/,/, "a,,", -1);
How to split string with trailing empty strings in result?
As Peter mentioned in his answer, "string".split(), in both Java and Scala, does not return trailing empty strings by default.
You can, however, specify for it to return trailing empty strings by passing in a second parameter, like this:
String s = "elem1,elem2,,";
String[] tokens = s.split(",", -1);
And that will get you the expected result.
The behavior is described in the Javadoc for String.split(String regex, int limit).
How to remove falsy values when splitting a string with a non-whitespace separator
If you want to be obtuse, you could use filter(None, x) to remove falsy items:
>>> list(filter(None, '1,2,,3,'.split(',')))
['1', '2', '3']
Probably less Pythonic. It might be clearer to iterate over the items specifically:
for w in '1,2,,3,'.split(','):
    if w:
        …
This makes it clear that you're skipping the empty items and not relying on the fact that str.split sometimes skips empty items.
I'd just as soon use a regex, either to skip consecutive runs of the separator (but watch out for the end):
>>> re.split(r',+', '1,2,,3,')
['1', '2', '3', '']
or to find everything that's not a separator:
>>> re.findall(r'[^,]+', '1,2,,3,')
['1', '2', '3']
If you want to go way back in Python's history, there were two separate functions, split and splitfields. I think the name explains the purpose. The first splits on any whitespace, useful for arbitrary text input, and the second behaves predictably on some delimited input. They were implemented in pure Python before v1.6.
Split vs Strip in Python to remove redundant white space
According to the documentation:
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
This means that the logic of strip() is already included in split(), so I think your teacher is wrong. (Note that this changes if you use a non-default separator.)
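A small sketch of the difference described above; the sample string and variable names are made up for illustration:

```python
text = "  hello   world  "

# With no separator, runs of whitespace collapse into one delimiter
# and leading/trailing whitespace produces no empty strings.
default_split = text.split()

# Calling strip() first therefore makes no difference here.
stripped_then_split = text.strip().split()

# With an explicit separator, every single space is a delimiter,
# so leading, trailing, and repeated spaces all yield empty strings.
explicit_split = text.split(" ")

print(default_split)        # ['hello', 'world']
print(stripped_then_split)  # ['hello', 'world']
print(explicit_split)       # ['', '', 'hello', '', '', 'world', '', '']
```

So strip() is redundant only in the no-argument form; with split(" ") it still changes the result.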
How to remove empty trailing values and Carriage Return in JS Array?
I don't think there's any magic bullet here, just a loop checking for the values you want to remove, directly or with a regular expression. For instance, to remove blank strings and "\r":
while (array.length) {                      // Loop while there are still entries
    const last = array[array.length - 1];   // Get the last entry without removing it
    if (last !== "" && last !== "\r") {     // Is this one to remove?
        break;                              // No, stop
    }
    --array.length;                         // Yes, remove and keep looping
}
Live Example:
const array = ['', 'Apple', '', 'Banana', '', 'Guava', '', '', '', '\r'];
while (array.length) {                      // Loop while there are still entries
    const last = array[array.length - 1];   // Get the last entry without removing it
    if (last !== "" && last !== "\r") {     // Is this one to remove?
        break;                              // No, stop
    }
    --array.length;                         // Yes, remove and keep looping
}
console.log(array);
Empty strings at the beginning and end of split
After reading AWK's specification following mu is too short, I came to feel that the original intention for split in AWK was to extract substrings that correspond to fields, each of which is terminated by a punctuation mark like , or . and that the separator was considered something like an "end of field character". The intention was not to split a string symmetrically into the left and right side of each separator position, but to terminate a substring on the left side of a separator position. Under this conception, it makes sense to always have some string (even if it is empty) on the left of the separator, but not necessarily on the right side of the separator. This may have been inherited by Ruby via Perl.
Why are empty strings returned in split() results?
str.split complements str.join, so
"/".join(['', 'segment', 'segment', ''])
gets you back the original string.
If the empty strings were not there, the first and last '/' would be missing after the join().
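The round trip can be checked directly. This is only an illustrative sketch; the path value and variable names are assumptions chosen to match the example above:

```python
path = "/segment/segment/"

# Splitting keeps the empty strings before the first and after the last '/'.
parts = path.split("/")

# Joining those parts with the same separator restores the original string.
rebuilt = "/".join(parts)

# If the empty strings are dropped, the leading and trailing '/' are lost.
without_empties = [p for p in parts if p]
lossy = "/".join(without_empties)

print(parts)    # ['', 'segment', 'segment', '']
print(rebuilt)  # /segment/segment/
print(lossy)    # segment/segment
```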