In Python, how do I split a string and keep the separators?
>>> import re
>>> re.split(r'(\W)', 'foo/bar spam\neggs')
['foo', '/', 'bar', ' ', 'spam', '\n', 'eggs']
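To see what the capturing group changes, here is a minimal sketch comparing the same split with and without parentheses. The variable names are only illustrative.

```python
import re

text = 'foo/bar spam\neggs'

# Without a capturing group, the separators are dropped.
without_group = re.split(r'\W', text)

# With a capturing group, each separator comes back as its own item.
with_group = re.split(r'(\W)', text)

print(without_group)  # ['foo', 'bar', 'spam', 'eggs']
print(with_group)     # ['foo', '/', 'bar', ' ', 'spam', '\n', 'eggs']
```

Because nothing is discarded, ''.join(with_group) gives back the original string.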
Split String with multiple delimiters and keep delimiters
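Try with parentheses (a capturing group):
>>> import re
>>> input_str = 'X < -500 & Y > 3000 / Z > 50'  # reconstructed by joining the output below
>>> split_str = re.split("(and | or | & | /)", input_str)
>>> split_str
['X < -500', ' & ', 'Y > 3000', ' /', ' Z > 50']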
If you want to remove the extra spaces:
>>> split_str = [i.strip() for i in re.split("(and | or | & | /)", input_str)]
>>> split_str
['X < -500', '&', 'Y > 3000', '/', 'Z > 50']
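If you would rather not strip afterwards, one option is to move the optional whitespace outside the capturing group, so it is consumed by the split but not returned. This is only a sketch, not the approach from the answer above: the input string is an assumed example, and the \b word boundaries around and/or are my addition so that words such as "band" are not split.

```python
import re

# Assumed sample input for illustration.
input_str = 'X < -500 & Y > 3000 / Z > 50'

# \s* sits outside the group, so surrounding spaces are consumed but not kept.
pattern = r'\s*(\band\b|\bor\b|&|/)\s*'
split_str = re.split(pattern, input_str)

print(split_str)  # ['X < -500', '&', 'Y > 3000', '/', 'Z > 50']
```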
Python: Split string without losing split character
If you want to do this in a single line:
string = "HELLO.WORLD.AGAIN."
pattern = "."
result = string.replace(pattern, f" {pattern} ").split(" ")
# ['HELLO', '.', 'WORLD', '.', 'AGAIN', '.', '']
# The separator at the end of the string leaves an empty last element; uncomment the next line to drop it.
# result = result[:-1]
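Note that the one-liner above relies on the text containing no spaces of its own. If it might, a regex split on the escaped separator is one alternative. This is a sketch with an assumed sample string; re.escape is used so that "." is treated literally rather than as "any character".

```python
import re

string = "HELLO WORLD.FOO BAR."  # assumed sample containing spaces
sep = "."

# The replace/split one-liner also splits on the spaces already in the text.
naive = string.replace(sep, f" {sep} ").split(" ")

# re.escape keeps "." literal; the filter drops the empty string
# left behind by the trailing separator.
parts = [p for p in re.split(f"({re.escape(sep)})", string) if p]

print(naive)  # ['HELLO', 'WORLD', '.', 'FOO', 'BAR', '.', '']
print(parts)  # ['HELLO WORLD', '.', 'FOO BAR', '.']
```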
Python RE library String Split but keep the delimiters/separators as part of the next string
If you are using Python 3.7+, you can split on zero-length matches using re.split and a positive lookahead:
import re
string = 'a+0b-2a+b-b'
re.split(r'(?=[+-])', string)
# ['a', '+0b', '-2a', '+b', '-b']
Demo: https://regex101.com/r/AB6UBa/1
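The same idea can attach each separator to the preceding piece instead, by swapping the lookahead for a lookbehind. A small sketch, again assuming Python 3.7+ (older versions do not split on zero-length matches):

```python
import re

string = 'a+0b-2a+b-b'

# Lookahead: split just before each sign, so the sign starts the next piece.
sign_first = re.split(r'(?=[+-])', string)

# Lookbehind: split just after each sign, so the sign ends the previous piece.
sign_last = re.split(r'(?<=[+-])', string)

print(sign_first)  # ['a', '+0b', '-2a', '+b', '-b']
print(sign_last)   # ['a+', '0b-', '2a+', 'b-', 'b']
```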
How do I split a string and keep the separators using python re library?
You can use re.findall to capture each parenthesized group:
import re
string = r"('Option A' | 'Option B') & ('Option C' | 'Option D')"
pattern = r"(\([^\)]+\))"
re.findall(pattern, string)
# ["('Option A' | 'Option B')", "('Option C' | 'Option D')"]
This also works with re.split:
re.split(pattern, string)
# ['', "('Option A' | 'Option B')", ' & ', "('Option C' | 'Option D')", '']
If you want to remove the empty elements that re.split produces, you can filter them out:
[s for s in re.split(pattern, string) if s]
# ["('Option A' | 'Option B')", ' & ', "('Option C' | 'Option D')"]
How the pattern works:
( begins the capture group
\( matches the character ( literally
[^\)]+ matches one or more characters that are not )
\) matches the character ) literally
) ends the capture group
PYTHON Split String at space but keep the spaces
I wonder why you need it, but it can be done like so:
import re
a = 'bla bla bla bla'
temp = re.sub(' ','\t \t',a)
result = temp.split('\t')
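For comparison, a capturing group in re.split should give the same list without the temporary tab characters. A minimal sketch using the same sample string:

```python
import re

a = 'bla bla bla bla'

# The tab-based approach from above.
via_tabs = re.sub(' ', '\t \t', a).split('\t')

# Capturing the space keeps it in the result as its own item.
via_group = re.split('( )', a)

print(via_group)  # ['bla', ' ', 'bla', ' ', 'bla', ' ', 'bla']
```

Unlike the tab trick, the capturing-group version also behaves if the input already contains tab characters.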
Split a string by regex and keep the separator AS A PART OF ITEMS in python
That happened because you used re.split with a capturing group, which keeps the captured chunks in the resulting list as separate items.
Your regex makes sense only if a match can span several lines; otherwise, extracting every line that starts with a time-like pattern would be enough.
That is why I'd suggest:
regex = r"\b\d+/\d+/\d.*?(?=\s*\b\d+/\d+/\d+|$)"
results = re.findall(regex, chat, re.S)
See the Python demo:
import re
chat = '''27/01/2019, 08:58 - Member 01 created group "Python Lovers ❤️"
27/01/2019, 08:58 - You were added
19/03/2019, 19:29 - Member 02: Hello guys,,,
19/03/2019, 19:29 - Member 03: Hi there..'''
regex = r"\b\d+/\d+/\d.*?(?=\s*\b\d+/\d+/\d+|$)"
results = re.findall(regex, chat, re.S)
for r in results:
    print(r)
Output:
27/01/2019, 08:58 - Member 01 created group "Python Lovers ❤️"
27/01/2019, 08:58 - You were added
19/03/2019, 19:29 - Member 02: Hello guys,,,
19/03/2019, 19:29 - Member 03: Hi there..
Note the absence of the redundant capturing group, and that there is no * after the positive lookahead, which had made it optional. Whitespace at the end of each match is stripped by the \s* pattern inside the lookahead.
The re.S flag allows . to match any character, including line break characters.
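As mentioned above, when every message fits on one line, matching whole lines is enough. Here is a minimal sketch of that simpler variant next to the lookahead version. The chat sample is made up (the continuation line is hypothetical) to show where the two differ.

```python
import re

# Hypothetical sample; the second message continues on a new line.
chat = '''27/01/2019, 08:58 - You were added
19/03/2019, 19:29 - Member 02: Hello guys,
second line of the same message
19/03/2019, 19:29 - Member 03: Hi there..'''

# Line-based: with re.M, ^ and $ anchor at every line,
# so only lines that start with a date are returned.
line_based = re.findall(r"^\d+/\d+/\d+.*$", chat, re.M)

# Lookahead-based (the regex suggested above): with re.S,
# a match can run across line breaks until the next date.
multi_line = re.findall(r"\b\d+/\d+/\d.*?(?=\s*\b\d+/\d+/\d+|$)", chat, re.S)

print(line_based[1])  # the continuation line is lost
print(multi_line[1])  # the continuation line is kept
```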
Split string into 2 columns, but keep the separator
You can add () around the pattern to keep the separators, for example:
df['column1'].str.split('(sep1|sep2|sep3)')
How would I split a python string and keep the separator, but the separator isn't a separate list item?
Use an if condition to check whether the length of each string is greater than 0, and only append the separator when it is.
split = [_ + str(char) for _ in split if len(_)>0]
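Since split and char come from the question and are not defined here, this is a self-contained sketch of the same comprehension with assumed sample values (text, char and pieces are illustrative names):

```python
text = "one.two.three."  # assumed sample input
char = "."

pieces = text.split(char)  # ['one', 'two', 'three', '']

# Re-attach the separator to every non-empty piece.
result = [piece + char for piece in pieces if len(piece) > 0]

print(result)  # ['one.', 'two.', 'three.']
```

One caveat: the separator is appended to every non-empty piece, so if the text does not end with the separator, the last item gains one that was not in the original.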