Split a string by a delimiter in Python
You can use the str.split method: string.split('__')
>>> "MATCHES__STRING".split("__")
['MATCHES', 'STRING']
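As a small illustration of how str.split behaves at the edges, the sketch below uses made-up sample strings (not taken from the question) to show the optional maxsplit argument and what happens when two delimiters sit next to each other.

```python
# Sample strings are illustrative only.
text = "MATCHES__STRING__AGAIN"

# Split on every occurrence of the delimiter.
all_parts = text.split("__")

# The optional second argument (maxsplit) limits the number of splits.
first_only = text.split("__", 1)

# Adjacent delimiters produce an empty string in the result.
with_gap = "a____b".split("__")

print(all_parts)
print(first_only)
print(with_gap)
```

If the empty strings are unwanted, they can be filtered out afterwards with a list comprehension.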
Split string at delimiter '\' in Python
You need to escape the backslash:
S.split('\\')
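In Python 2, you may also need the string_escape codec, because \t inside a normal string literal is a tab rather than a backslash followed by t: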
In [10]: s = 'greenland.gdb\topology_check\t_buildings'
In [11]: s.split("\\")
Out[11]: ['greenland.gdb\topology_check\t_buildings']
In [12]: s.encode("string_escape").split("\\")
Out[12]: ['greenland.gdb', 'topology_check', 't_buildings']
\t is interpreted as a tab character unless you use a raw string:
In [18]: s = 'greenland.gdb\topology_check\t_buildings'
In [19]: print(s)
greenland.gdb opology_check _buildings
In [20]: s = r'greenland.gdb\topology_check\t_buildings'
In [21]: print(s)
greenland.gdb\topology_check\t_buildings
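The string_escape codec is not available in Python 3. A rough equivalent, assuming the text contains only ASCII characters, is to round-trip through the unicode_escape codec. When you control the literal, a raw string is simpler. This is a sketch of both options, not the original answer's method.

```python
# The literal below contains two real tab characters, as in the example above.
s = 'greenland.gdb\topology_check\t_buildings'

# Python 3 sketch: turn each tab back into the two characters '\' and 't',
# then split on the backslash. Assumes ASCII-only input.
escaped = s.encode("unicode_escape").decode("ascii")
parts = escaped.split("\\")

# With a raw string no conversion is needed.
raw_parts = r'greenland.gdb\topology_check\t_buildings'.split("\\")

print(parts)
print(raw_parts)
```

Note that unicode_escape doubles any backslash already present in the string, which would produce empty fields after the split, so this only suits input whose separators were all swallowed by escape sequences.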
How to split a string with many delimiters in Python?
For performance, you should use regex as per the marked duplicate. See benchmarking below.
groupby + str.isalnum
You can use itertools.groupby with str.isalnum to group consecutive alphanumeric characters together.
With this solution you do not have to worry about splitting by explicitly specified characters.
from itertools import groupby
x = " has 15 science@and^engineering--departments, affiliated centers, Bandar Abbas&&and Mahshahr."
res = [''.join(j) for i, j in groupby(x, key=str.isalnum) if i]
print(res)
['has', '15', 'science', 'and', 'engineering', 'departments',
'affiliated', 'centers', 'Bandar', 'Abbas', 'and', 'Mahshahr']
Benchmarking vs regex
Some performance benchmarking versus regex solutions (tested on Python 3.6.5):
from itertools import groupby
import re
x = " has 15 science@and^engineering--departments, affiliated centers, Bandar Abbas&&and Mahshahr."
z = x*10000
%timeit [''.join(j) for i, j in groupby(z, key=str.isalnum) if i] # 184 ms
%timeit list(filter(None, re.sub(r'\W+', ',', z).split(','))) # 82.1 ms
%timeit list(filter(None, re.split(r'\W+', z))) # 63.6 ms
%timeit [_ for _ in re.split(r'\W', z) if _] # 62.9 ms
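For reference, the regex variant from the benchmark can be wrapped in a small function. The name split_words is hypothetical and only used for this sketch. It is run on the same sample string and compared against the groupby result.

```python
import re
from itertools import groupby

x = " has 15 science@and^engineering--departments, affiliated centers, Bandar Abbas&&and Mahshahr."

def split_words(text):
    """Split on runs of non-word characters and drop empty strings."""
    return [part for part in re.split(r"\W+", text) if part]

regex_result = split_words(x)
groupby_result = [
    "".join(group)
    for is_alnum, group in groupby(x, key=str.isalnum)
    if is_alnum
]

print(regex_result)
print(regex_result == groupby_result)
```

The two approaches agree on this input, but they are not identical in general: \W treats the underscore as a word character, so "a_b" stays in one piece with the regex, while str.isalnum splits it.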
Python split string with delimiter
One way with regex:
import re
def findUrlFromString(string):
    regex = r"(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))"
    # findall returns one tuple per match; the first group is the full URL
    url = re.findall(regex, string)
    return [x[0] for x in url]
string = """
- https://site1 # site1
- https://site2 # site2
- https://site3 # site3
- https://site4 # ssite4
"""
print(findUrlFromString(string))
WORKING DEMO: https://rextester.com/LEHDE94008
Another way, with a list comprehension:
list_of_urls = ['-https://site1#site1', '-https://site2#site2', '-https://site3#site3', '-https://site4#site4']
result = [i.split('#')[0].lstrip('-') for i in list_of_urls]
print(result)
WORKING DEMO: https://rextester.com/VNW41814
Splitting a Python string at a delimiter, but a specific one
How about splitting the string into words and rejoining each half of the word list:
s = "The cat jumped over the moon very quickly"
l = s.split()
s1 = ' '.join(l[:len(l)//2])
s2 = ' '.join(l[len(l)//2 :])
print(s1)
print(s2)
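The same idea can be packaged as a function. This is a minimal sketch. The name split_in_half is hypothetical, and it assumes that splitting at the middle of the word list is what is wanted.

```python
def split_in_half(text):
    """Split text into two strings at the midpoint of its word list."""
    words = text.split()
    middle = len(words) // 2
    return " ".join(words[:middle]), " ".join(words[middle:])

first, second = split_in_half("The cat jumped over the moon very quickly")
print(first)   # The cat jumped over
print(second)  # the moon very quickly
```

With an odd number of words the second half receives the extra word, and because split() is called without arguments, runs of whitespace in the input collapse to single spaces in the output.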
Split string using a newline delimiter with Python
The str.splitlines method should give you exactly that.
>>> data = """a,b,c
... d,e,f
... g,h,i
... j,k,l"""
>>> data.splitlines()
['a,b,c', 'd,e,f', 'g,h,i', 'j,k,l']
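If each line then needs to be broken into fields, the two splits can be combined. The sketch below assumes the comma-separated layout of the sample data, and also shows how splitlines differs from split("\n") when the text ends with a newline.

```python
data = """a,b,c
d,e,f
g,h,i
j,k,l"""

# Split into lines first, then split each line on the comma.
rows = [line.split(",") for line in data.splitlines()]
print(rows)

# Unlike split("\n"), splitlines() does not yield a trailing empty string
# when the text ends with a newline.
print("a\nb\n".split("\n"))
print("a\nb\n".splitlines())
```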
Split Strings into words with multiple word boundary delimiters
A case where regular expressions are justified:
import re
DATA = "Hey, you - what are you doing here!?"
print(re.findall(r"[\w']+", DATA))
# Prints ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
Splitting on last delimiter in Python string?
Use .rsplit() or .rpartition() instead:
s.rsplit(',', 1)
s.rpartition(',')
str.rsplit() lets you specify how many times to split, while str.rpartition() only splits once but always returns a fixed number of elements (prefix, delimiter, and postfix) and is faster for the single-split case.
Demo:
>>> s = "a,b,c,d"
>>> s.rsplit(',', 1)
['a,b,c', 'd']
>>> s.rsplit(',', 2)
['a,b', 'c', 'd']
>>> s.rpartition(',')
('a,b,c', ',', 'd')
Both methods start splitting from the right-hand side of the string; by giving str.rsplit() a maximum as the second argument, you get to split just the right-hand-most occurrences.
If you only need the last element, but there is a chance that the delimiter is not present in the input string or is the very last character in the input, use the following expressions:
# last element, or the original if no `,` is present or is the last character
s.rsplit(',', 1)[-1] or s
s.rpartition(',')[-1] or s
If you need the delimiter gone even when it is the last character, I'd use:
def last(string, delimiter):
    """Return the last element from string, after the delimiter

    If string ends in the delimiter or the delimiter is absent,
    returns the original string without the delimiter.

    """
    prefix, delim, last = string.rpartition(delimiter)
    # delim is empty when the delimiter is absent; last then holds the whole string
    return prefix if (delim and not last) else last
This uses the fact that string.rpartition() returns the delimiter as the second element only if it was present, and an empty string otherwise.
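To make the three cases concrete, here is a self-contained sketch. last_field is a hypothetical name used only for this illustration, and the sample strings are made up.

```python
def last_field(string, delimiter):
    """Hypothetical helper: return the text after the final delimiter,
    or the text before it when the string ends with the delimiter."""
    prefix, delim, tail = string.rpartition(delimiter)
    # delim is '' when the delimiter was not found; tail then holds the
    # whole string, so returning tail covers that case as well.
    return prefix if (delim and not tail) else tail

for sample in ("a,b,c,d", "a,b,c,", "abc"):
    print(repr(sample), sample.rpartition(","), repr(last_field(sample, ",")))
```

The printed tuples show why the check on the middle element matters: when the delimiter is absent, rpartition puts the whole string in the last slot and leaves the first two empty.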
Split string with multiple delimiters in Python
Luckily, Python has this built-in :)
import re
re.split('; |, ', string_to_split)
Update:
Following your comment:
>>> a='Beautiful, is; better*than\nugly'
>>> import re
>>> re.split(r'; |, |\*|\n',a)
['Beautiful', 'is', 'better', 'than', 'ugly']
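When the delimiters come from a list rather than a hand-written pattern, the alternation can be built programmatically. This is a sketch under that assumption. split_on_any is a hypothetical helper, and re.escape is used so that characters such as * are matched literally.

```python
import re

def split_on_any(text, delimiters):
    """Split text on any of the given literal delimiters."""
    # re.escape makes characters such as '*' match literally.
    pattern = "|".join(re.escape(d) for d in delimiters)
    return re.split(pattern, text)

a = 'Beautiful, is; better*than\nugly'
print(split_on_any(a, ['; ', ', ', '*', '\n']))
```

Alternatives are tried left to right, so if one delimiter is a prefix of another, list the longer one first.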