Split by Comma and Strip Whitespace in Python

Use a list comprehension: it is simpler than a for loop and just as easy to read.

my_string = "blah, lots  ,  of ,  spaces, here "
result = [x.strip() for x in my_string.split(',')]
# result is ["blah", "lots", "of", "spaces", "here"]

See: Python docs on List Comprehension

A good 2 second explanation of list comprehension.

Split vs Strip in Python to remove redundant white space

According to the documentation:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This means the logic of strip() is already built into split() when no separator is given, so I think your teacher is wrong. (Note that this no longer holds if you use a non-default separator.)
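
A quick sketch of the difference, using made-up sample strings (the variable names are only for illustration):

```python
# Hypothetical sample input with leading, trailing, and repeated whitespace.
text = "  blah   lots  of   spaces  "

# No separator: runs of whitespace collapse and the ends are trimmed.
no_sep = text.split()

# Explicit separator: every single space splits, so empty strings appear.
with_sep = text.split(" ")

# With a comma separator, surrounding whitespace stays attached to each item.
csv_like = "a , b,c ".split(",")

print(no_sep)    # ['blah', 'lots', 'of', 'spaces']
print(csv_like)  # ['a ', ' b', 'c ']
```

So with a comma as the separator, you still need strip() on each piece.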

Splitting string and removing whitespace Python

Python has a spectacular method called split that will keep you from having to use a regex or something similar. You can split your string by just calling my_string.split(delimiter).

After that, Python has a strip method, which removes all whitespace from the beginning and end of a string.

[item.strip() for item in my_string.split(',')]

Benchmarks for the two methods are below:

>>> import timeit
>>> timeit.timeit('map(str.strip, "QVOD, Baidu Player".split(","))', number=100000)
0.3525350093841553
>>> timeit.timeit('map(stripper, "QVOD, Baidu Player".split(","))','stripper=str.strip', number=100000)
0.31575989723205566
>>> timeit.timeit("[item.strip() for item in 'QVOD, Baidu Player'.split(',')]", number=100000)
0.246596097946167

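So in these timings the list comprehension takes about 30% less time than map with str.strip (0.247 s vs. 0.353 s). Note that these numbers appear to come from Python 2, where map returns a list; in Python 3, map returns a lazy iterator, so the map calls as written would not actually do the stripping.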

It's probably also worth noting that, as far as being "pythonic" goes, Guido himself votes for the list comprehension. http://www.artima.com/weblogs/viewpost.jsp?thread=98196
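
The map timings above never consume the result of map, which only does the work on Python 2, where map returns a list. Here is a rough sketch of the same comparison for Python 3, with map wrapped in list() so the stripping actually happens. The function names with_map and with_list_comp are made up for illustration, and no timings are claimed since they vary by machine and Python version.

```python
import timeit

sample = "QVOD, Baidu Player"

def with_map(s=sample):
    # list() forces the lazy Python 3 map object to do the work.
    return list(map(str.strip, s.split(",")))

def with_list_comp(s=sample):
    return [item.strip() for item in s.split(",")]

# Both approaches should produce the same list.
assert with_map() == with_list_comp() == ["QVOD", "Baidu Player"]

# timeit.timeit accepts a callable as well as a source string.
map_time = timeit.timeit(with_map, number=100000)
comp_time = timeit.timeit(with_list_comp, number=100000)
print(f"map: {map_time:.3f}s  list comp: {comp_time:.3f}s")
```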

Split text only by comma or (comma and space) not only with space

Given:

>>> s="John Doe, Jack, , Henry,Harry,,Rob"

You can do:

>>> import re
>>> [e for e in re.split(r'\s*,\s*',s) if e]
['John Doe', 'Jack', 'Henry', 'Harry', 'Rob']
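
If you would rather not use a regex, roughly the same result can be sketched with split and strip, filtering out the pieces that end up empty. The variable name names is just for illustration.

```python
s = "John Doe, Jack, , Henry,Harry,,Rob"

# Split on commas, then keep only the pieces that are non-empty after stripping.
names = [part.strip() for part in s.split(",") if part.strip()]
print(names)  # ['John Doe', 'Jack', 'Henry', 'Harry', 'Rob']
```

Unlike the regex version, this also trims whitespace at the very start and end of the string.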

(Python) how to remove whitespace after comma in text file

Just split by (", ") instead of (",")

[i.split(', ') for i in f]
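
One thing to watch: lines read from a file keep their trailing newline, so the last field of each row would end in "\n". Here is a small sketch that strips it first; io.StringIO stands in for the open file, and its contents are made up.

```python
import io

# io.StringIO stands in for an open text file; the contents are made up.
f = io.StringIO("a, b, c\nd, e, f\n")

# Drop the line ending first, then split on the comma-space pair.
rows = [line.rstrip("\n").split(", ") for line in f]
print(rows)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```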

Python Split on Space Except Between Words & After Commas

You can use this regex, which looks for a space which is not preceded by a letter or comma, or is not followed by a letter:

(?<![a-z,]) | (?![a-z])

In python:

import re
a = "11/27/2019 Sold $900,000 -6.2% Suzanne Freeze-Manning, Kevin Garvey"
b = "11/2/2019 Pending sale $959,000"

print(re.split(r'(?<![a-z,]) | (?![a-z])', a, flags=re.IGNORECASE))
print(re.split(r'(?<![a-z,]) | (?![a-z])', b, flags=re.IGNORECASE))

Output:

['11/27/2019', 'Sold', '$900,000', '-6.2%', 'Suzanne Freeze-Manning, Kevin Garvey']
['11/2/2019', 'Pending sale', '$959,000']

python re split string by commas and space

You can use findall and match what you want:

>>> print(re.findall(r'[^,\s]+', '    5,    3,   , hello'))
['5', '3', 'hello']

[^,\s]+ uses a negated character class to match any run of characters that are neither commas nor whitespace.

Your split regex ,|\s+ splits at multiple positions because the comma is surrounded by whitespace as well.

Because your input has leading whitespace, even splitting on [,\s]+ gives an empty element at the start.

>>> print(re.split(r'[,\s]+', '    5,    3,   , hello'))
['', '5', '3', 'hello']
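
If you still prefer split, here are two possible ways around that empty first element, sketched below: strip the separator characters from both ends before splitting, or filter out empty strings afterwards. The variable names are only for illustration.

```python
import re

text = "    5,    3,   , hello"

# Strip separators from both ends first so split has nothing to match there.
parts = re.split(r"[,\s]+", text.strip(", \t\n"))
print(parts)  # ['5', '3', 'hello']

# Alternatively, filter out empty strings after splitting.
filtered = [p for p in re.split(r"[,\s]+", text) if p]
print(filtered)  # ['5', '3', 'hello']
```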

python split string on whitespace

I see that you sometimes have several \t characters in a row. I'd use the re module to split correctly:

import re

for line in lines:
    linedata = re.split(r'\t+', line)
    print(",".join(linedata))
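
To see why the regex helps, here is a small comparison on a made-up line with a doubled tab. Plain str.split("\t") treats every tab as its own separator, so an empty field appears.

```python
import re

# Made-up sample line containing two tabs in a row.
line = "a\t\tb\tc"

plain = line.split("\t")         # each tab is a separator
merged = re.split(r"\t+", line)  # a run of tabs is one separator

print(plain)   # ['a', '', 'b', 'c']
print(merged)  # ['a', 'b', 'c']
```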

