Split Strings into words with multiple word boundary delimiters
A case where regular expressions are justified:
import re
DATA = "Hey, you - what are you doing here!?"
print(re.findall(r"[\w']+", DATA))
# Prints ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
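To see why findall is convenient here, the sketch below contrasts it with re.split on the complementary character class. The names words and pieces are only illustrative. It suggests that splitting on the delimiters leaves an empty string where the text ends with punctuation, while matching the words directly does not.

```python
import re

DATA = "Hey, you - what are you doing here!?"

# findall collects the runs of word characters and apostrophes.
words = re.findall(r"[\w']+", DATA)

# split cuts on runs of everything else, which leaves an empty string
# wherever the text starts or ends with a delimiter.
pieces = re.split(r"[^\w']+", DATA)

print(words)
print(pieces)
```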
Split string with multiple delimiters in Python
Luckily, Python has this built-in :)
import re
re.split('; |, ', string_to_split)
Update:
Following your comment:
>>> a='Beautiful, is; better*than\nugly'
>>> import re
>>> re.split(r'; |, |\*|\n', a)
['Beautiful', 'is', 'better', 'than', 'ugly']
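When the delimiters come from a list rather than a hand-written pattern, one possible approach is to build the alternation with re.escape so that characters such as * need no manual escaping. The helper name split_on_any is hypothetical, and this is a minimal sketch rather than the answerer's own code.

```python
import re

def split_on_any(text, delimiters):
    """Split text on any of the literal delimiter strings (hypothetical helper)."""
    # re.escape neutralises metacharacters such as '*'. Sorting longest first
    # keeps a short delimiter from shadowing a longer one that starts with it.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in ordered)
    return re.split(pattern, text)

parts = split_on_any("Beautiful, is; better*than\nugly", ["; ", ", ", "*", "\n"])
print(parts)
```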
How to split a string with multiple delimiters using string.split()?
This code takes each string element of the list, replaces " at " with " | ", splits on " | ", and then rebinds the name to the list of resulting sub-lists.
Side-note: Don't use list as a variable name. It is not a keyword, but it is the name of a built-in type, and reusing it shadows that built-in.
lis = ['Sep 10, 2020 at 17:36 | Kate', 'Sep 10, 2020 at 17:13 | Charles']
lis = [string.replace(" at ", " | ").split(" | ") for string in lis]
print(lis)
Output:
[['Sep 10, 2020', '17:36', 'Kate'], ['Sep 10, 2020', '17:13', 'Charles']]
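The same replace-then-split idea can be generalised to any number of literal delimiters. The following is a sketch under the assumption that none of the delimiters occurs inside the data itself; split_multi is a hypothetical helper name.

```python
def split_multi(text, delimiters):
    """Split text on several literal delimiters using only str methods."""
    first, *rest = delimiters
    # Rewrite every other delimiter as the first one, then split once.
    for delim in rest:
        text = text.replace(delim, first)
    return text.split(first)

rows = ['Sep 10, 2020 at 17:36 | Kate', 'Sep 10, 2020 at 17:13 | Charles']
parsed = [split_multi(row, [" | ", " at "]) for row in rows]
print(parsed)
```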
Split String with multiple delimiters and keep delimiters
Try wrapping the pattern in parentheses. With a capturing group, re.split keeps the matched delimiters in the result:
>>> split_str = re.split("(and | or | & | /)", input_str)
>>> split_str
['X < -500', ' & ', 'Y > 3000', ' /', ' Z > 50']
>>>
If you want to remove extra spaces:
>>> split_str = [i.strip() for i in re.split("(and | or | & | /)", input_str)]
>>> split_str
['X < -500', '&', 'Y > 3000', '/', 'Z > 50']
>>>
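As a variation, the surrounding whitespace can be consumed by the pattern itself, outside the capturing group, so that no strip() pass is needed. The input string below is an assumption chosen to match the output shown above, since the original input_str is not given.

```python
import re

# Assumed sample input; the original input_str is not shown in the answer.
expr = "X < -500 & Y > 3000 / Z > 50"

# The capturing group keeps each operator in the result. The \s* outside
# the group swallows the surrounding spaces so they are not returned.
tokens = re.split(r"\s*(\band\b|\bor\b|&|/)\s*", expr)
print(tokens)
```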
Splitting strings using multiple delimiters in Python. Getting TypeError: expected string or bytes-like object
The re functions receive a string, not a pandas DataFrame column. In this case you should use the .str accessor, with a character class so that either delimiter matches:
df['A'] = df['Sport'].str.split(r'[;,]')  # split on ';' or ','
I hope it resolves your problem
How to split a string by multiple punctuations with Python?
You can use a regex to achieve this:
>>> import re
>>> s = 'a,b,c d!e.f\ngood\tmorning&night'
>>> re.split('[?.,\n\t&! ]', s)
['a', 'b', 'c', 'd', 'e', 'f', 'good', 'morning', 'night']
If you are looking for a solution using split(), then here's a workaround:
>>> identifiers = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~\n\t '
>>> "".join((' ' if c in identifiers else c for c in s)).split()
['a', 'b', 'c', 'd', 'e', 'f', 'good', 'morning', 'night']
Here, I am replacing every character listed in identifiers with a space " ", and then splitting the string on whitespace.
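An alternative sketch of the same workaround uses str.translate with the standard library's string.punctuation instead of a hand-typed character list. It assumes that only ASCII punctuation and whitespace act as separators.

```python
import string

s = 'a,b,c d!e.f\ngood\tmorning&night'

# Map every ASCII punctuation character to a space, then let split()
# collapse the runs of whitespace (spaces, tabs, newlines).
table = str.maketrans(string.punctuation, " " * len(string.punctuation))
words = s.translate(table).split()
print(words)
```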
How do I split a string with multiple word delimiters in Python?
You could use re like this. Updated using the better way suggested by @pault: word boundaries (\b) instead of :space:.
>>> import re
>>> words = ['hello world', 'hello my name is jolloopp', 'my jolloopp name is hello']
>>> splitters = ['my', 'is']  # not shown in the original answer; inferred from the output below
# Iterate over the list of words and then use the `re` to split the strings,
>>> [z for y in (re.split('|'.join(r'\b{}\b'.format(x) for x in splitters), word) for word in words) for z in y]
['hello world', 'hello ', ' name ', ' jolloopp', '', ' jolloopp name ', ' hello']
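The result above still carries stray spaces and an empty string. A possible cleanup, sketched below, compiles the pattern once, escapes each splitter, and drops blank pieces. The value of splitters is an assumption consistent with the output above, and chunks is an illustrative name.

```python
import re

words = ['hello world', 'hello my name is jolloopp', 'my jolloopp name is hello']
splitters = ['my', 'is']  # assumed; consistent with the output shown above

# \b anchors each splitter at word boundaries, so 'is' does not match inside 'this'.
pattern = re.compile("|".join(r"\b{}\b".format(re.escape(x)) for x in splitters))

chunks = [
    piece.strip()
    for phrase in words
    for piece in pattern.split(phrase)
    if piece.strip()  # skip empty or whitespace-only pieces
]
print(chunks)
```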