Best Way to Specify Whitespace in a String.Split Operation

If you just call:

string[] ssize = myStr.Split(null); //Or myStr.Split()

or:

string[] ssize = myStr.Split(new char[0]);

then whitespace is assumed to be the splitting character. From the string.Split(char[]) method's documentation page:

If the separator parameter is null or contains no characters, white-space characters are assumed to be the delimiters. White-space characters are defined by the Unicode standard and return true if they are passed to the Char.IsWhiteSpace method.

Always, always, always read the documentation!

C# - Split a string with spaces in between in multiple strings

This will work:

string sentence = "Example sentence";
string[] sentences = sentence.Split(' ');  // split on a single space

string one = sentences[0];  // "Example"
string two = sentences[1];  // "sentence"

Splitting a string at all whitespace

String.Split() (no parameters) does split on all whitespace (including LF/CR).

split string with whitespace AND multiple other operation signs?

You need to understand regular expressions first. For basic use they are very simple :) For your requirement, you can split the string with:

String[] split = mystr.split("[-+*/#_^]");

The square brackets provide a list of characters to match; if any one of the characters inside the brackets matches, it is a match. To split on whitespace as well, add \\s inside the brackets: "[-+*/#_^\\s]".
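For comparison, here is a minimal sketch of the same character-class idea using Python's re module rather than Java. The input string is hypothetical, and \s plus a trailing + are added to the class from the answer so that whitespace also delimits and runs of delimiters count as one split point.

```python
import re

# Hypothetical input; the character class mirrors the one in the answer above,
# with \s added so whitespace also acts as a delimiter.
expr = "a+b - c*d/e"

# One or more delimiter characters in a row count as a single split point.
tokens = re.split(r"[-+*/#_^\s]+", expr)
print(tokens)  # ['a', 'b', 'c', 'd', 'e']
```

Without the trailing +, the " - " between b and c would produce empty strings in the result.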

Preserve whitespaces when using split() and join() in python

You want to use re.split() in that case, with a group:

re.split(r'(\s+)', line)

would return both the columns and the whitespace so you can rejoin the line later with the same amount of whitespace included.

Example:

>>> re.split(r'(\s+)', line)
['BBP1', ' ', '0.000000', ' ', '-0.150000', ' ', '2.033000', ' ', '0.00', ' ', '-0.150', ' ', '1.77']

You probably do want to remove the newline from the end.
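The round trip can be sketched as follows. The sample line is hypothetical (it reuses values from the example above), and the sketch assumes the line does not begin with whitespace, so columns sit at even indices.

```python
import re

# Hypothetical fixed-width line; the column values echo the example above.
line = "BBP1   0.000000  -0.150000\n"

# Strip the trailing newline first, otherwise the result ends with '\n', ''.
parts = re.split(r"(\s+)", line.rstrip("\n"))
print(parts)  # ['BBP1', '   ', '0.000000', '  ', '-0.150000']

# Even indices hold columns, odd indices hold the captured whitespace.
parts[2] = "1.000000"

# Rejoin with the original spacing intact.
rebuilt = "".join(parts)
print(rebuilt)  # BBP1   1.000000  -0.150000
```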

Python: how to split a string based on space but keep '\n'?

Performance-wise, you should consider using a list comprehension (as mentioned in Ursus's answer) with str.split(' '):

>>> a = ' girl\n    is'

>>> [word for word in a.split(' ') if word]
['girl\n', 'is']

However, if you are interested in a functional approach, you may use filter:

>>> list(filter(bool, a.split(' ')))
['girl\n', 'is']

Here a.split(' ') splits your string using a single space as the separator, and filter (with bool) removes the empty strings from the resulting list.


Issue with your code

As Python's str.split documentation says:

  • if the separator is not passed or is None:

    a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].

  • if the separator is passed:

    consecutive delimiters are not grouped together and are deemed to delimit empty strings. Splitting an empty string with a specified separator returns [''].
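A short sketch of the two behaviours side by side may help; the input string and variable names are made up for illustration.

```python
text = "  a  b "

# No separator: runs of whitespace collapse, no empty strings at the ends.
default_split = text.split()

# Explicit separator: every single space delimits, so empty strings appear.
space_split = text.split(" ")

print(default_split)   # ['a', 'b']
print(space_split)     # ['', '', 'a', '', 'b', '']
print("".split())      # []
print("".split(" "))   # ['']
```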

How to split a String by space

What you have should work. If, however, the separators in your string turn out to be something other than plain spaces, you can use the whitespace regex:

String str = "Hello I'm your String";
String[] splited = str.split("\\s+");

\\s+ matches one or more consecutive whitespace characters (spaces, tabs, newlines), so any run of whitespace is treated as a single delimiter between tokens.

Does Python's split function split by a newline or a whitespace by default

If sep is not specified or is None, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single
separator and the result will contain no empty strings at the start
or end if the string has leading or trailing whitespace.

Tabs (\t), newlines (\n), spaces, and so on all count as whitespace characters, since they all serve the same purpose: to space things out.
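As a small illustration (the sample string and variable names are hypothetical), the default call treats all of these characters alike, while passing "\n" restricts the split to newlines:

```python
text = "one two\tthree\nfour"

# Default split: spaces, tabs, and newlines are all treated as separators.
by_whitespace = text.split()
print(by_whitespace)  # ['one', 'two', 'three', 'four']

# Passing "\n" explicitly splits on newlines only.
by_newline = text.split("\n")
print(by_newline)     # ['one two\tthree', 'four']
```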

Split string on whitespace in Python

The str.split() method without an argument splits on whitespace:

>>> "many   fancy word \nhello    \thi".split()
['many', 'fancy', 'word', 'hello', 'hi']

