Splitting on last delimiter in Python string?
Use .rsplit() or .rpartition() instead:
s.rsplit(',', 1)
s.rpartition(',')
str.rsplit() lets you specify how many times to split, while str.rpartition() only splits once but always returns a fixed number of elements (prefix, delimiter and postfix) and is faster for the single-split case.
Demo:
>>> s = "a,b,c,d"
>>> s.rsplit(',', 1)
['a,b,c', 'd']
>>> s.rsplit(',', 2)
['a,b', 'c', 'd']
>>> s.rpartition(',')
('a,b,c', ',', 'd')
Both methods start splitting from the right-hand side of the string; by giving str.rsplit() a maximum as the second argument, you get to split just the rightmost occurrences.
If you only need the last element, but there is a chance that the delimiter is not present in the input string or is the very last character in the input, use the following expressions:
# last element, or the original if no `,` is present or is the last character
s.rsplit(',', 1)[-1] or s
s.rpartition(',')[-1] or s
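To see how the fallback behaves on the edge cases just mentioned, here is a minimal sketch. The helper name last_or_original and the sample strings are made up for illustration.

```python
def last_or_original(s, delimiter=","):
    # rpartition(...)[-1] is the text after the last delimiter. It is the
    # whole string when the delimiter is absent, and '' when the string
    # ends with the delimiter, in which case `or s` falls back to s.
    return s.rpartition(delimiter)[-1] or s


# Illustrative inputs: normal case, no delimiter, trailing delimiter.
for sample in ["a,b,c,d", "abcd", "a,b,"]:
    print(repr(sample), "->", repr(last_or_original(sample)))
```

Note that with a trailing delimiter the result still contains the delimiter, because the whole original string is returned.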
If you need the delimiter gone even when it is the last character, I'd use:
def last(string, delimiter):
    """Return the last element from string, after the delimiter

    If string ends in the delimiter or the delimiter is absent,
    returns the original string without the delimiter.
    """
    prefix, delim, last = string.rpartition(delimiter)
    # Only fall back to the prefix when the delimiter was found at the very end.
    return prefix if (delim and not last) else last
This uses the fact that string.rpartition() returns the delimiter as the second element only if it was present, and an empty string otherwise.
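A quick check of the three cases the docstring describes may help. The function is restated here as last_element (a name chosen for this sketch) so the block runs on its own, and the inputs are illustrative.

```python
def last_element(string, delimiter):
    """Return the text after the last delimiter.

    If the string ends in the delimiter, return what precedes it;
    if the delimiter is absent, return the string unchanged.
    """
    prefix, delim, tail = string.rpartition(delimiter)
    # delim is '' when the delimiter was not found; tail is '' when the
    # delimiter is the final character.
    return prefix if (delim and not tail) else tail


print(last_element("a,b,c", ","))  # delimiter in the middle
print(last_element("a,b,", ","))   # delimiter is the last character
print(last_element("abc", ","))    # delimiter absent
```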
Split string into 2 based on last occurrence of a separator
Use rpartition(s). It does exactly that. You can also use rsplit(s, 1).
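The two calls differ mainly in what they return when the separator is missing. A small sketch, with made-up sample strings:

```python
path = "usr/local/lib"

# rpartition always returns a 3-tuple: (head, separator, tail).
head, sep, tail = path.rpartition("/")

# rsplit with maxsplit=1 returns a list of at most two items.
parts = path.rsplit("/", 1)

# When the separator is absent, rpartition still yields three items
# (two of them empty), while rsplit yields a single-item list, so
# unpacking its result into two names would raise ValueError.
missing = "nodelimiter"
missing_partition = missing.rpartition("/")
missing_split = missing.rsplit("/", 1)

print(head, tail, parts)
print(missing_partition, missing_split)
```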
pandas split by last delimiter
With Series.str.rsplit, limiting the number of splits.
df.col1.str.rsplit('|', n=1, expand=True).rename(lambda x: f'col{x + 1}', axis=1)
If the above throws a SyntaxError, it means you're on a Python version older than 3.6 (shame on you!). Use instead
df.col1.str.rsplit('|', n=1, expand=True)\
    .rename(columns=lambda x: 'col{}'.format(x + 1))
          col1 col2
0      MLB|NBA  NFL
1          MLB  NBA
2  NFL|NHL|NBA  MLB
There's also the faster loopy str.rsplit equivalent.
pd.DataFrame(
    [x.rsplit('|', 1) for x in df.col1.tolist()],
    columns=['col1', 'col2']
)
          col1 col2
0      MLB|NBA  NFL
1          MLB  NBA
2  NFL|NHL|NBA  MLB
P.S., yes, the second solution is faster:
df = pd.concat([df] * 100000, ignore_index=True)
%timeit df.col1.str.rsplit('|', n=1, expand=True)
%timeit pd.DataFrame([x.rsplit('|', 1) for x in df.col1.tolist()])
473 ms ± 13.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
128 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Python split at last dot before x characters
text = "Lorem ipsum dolor sit amet, consetetur sadipscing elitr. sed diam nonumy eirmod tempor. invidunt ut labore et dolore mgna aliquyam erat. sed diam voluptua. At vero eos et accusam et justo duo dolores."
n = 100
# drop the trailing period so the last chunk does not end in ".."
text = text.rstrip(".")
# cut into fixed-size pieces of n characters and append a period to each
chunks = [text[i:i+n] + "." for i in range(0, len(text), n)]
print(chunks)
output:
['Lorem ipsum dolor sit amet, consetetur sadipscing elitr. sed diam nonumy eirmod tempor. invidunt ut .', 'labore et dolore mgna aliquyam erat. sed diam voluptua. At vero eos et accusam et justo duo dolores.']
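The snippet above cuts at fixed offsets rather than at a dot. If the goal is to break at the last "." that falls within the first n characters, one possible sketch uses str.rfind. The function name split_at_last_dot and the hard-cut fallback are assumptions made for this example, not part of the original answer.

```python
def split_at_last_dot(text, n):
    """Split text into chunks of at most n characters, cutting just after
    the last '.' inside each n-character window when there is one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    chunks = []
    while len(text) > n:
        cut = text.rfind(".", 0, n)  # index of the last '.' in text[:n], or -1
        if cut == -1:
            cut = n - 1  # assumption: no dot in the window, so cut hard at n
        chunk = text[:cut + 1].strip()
        if chunk:
            chunks.append(chunk)
        text = text[cut + 1:].lstrip()
    if text:
        chunks.append(text)
    return chunks


print(split_at_last_dot("One. Two. Three.", 10))
```

Every chunk is at most n characters long, and the remainder is re-scanned from its own start, so later chunks also end at a dot where possible.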
Python - Get Last Element after str.split()
Use a list comprehension to take the last element of each of the split strings:
ids = [val[-1] for val in your_string.split()]
Read lines from file and split line on last delimiter in Python and save it to another file
You could try using the split method on each line of data:
# open both files in one with statement so they are closed automatically
with open('data.txt', 'r') as file_to_open, open("adresses.txt", 'a') as file_save:
    for address in file_to_open:
        # keep only the text after the last ':' (the newline stays attached)
        file_save.write(address.split(":")[-1])
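The same per-line logic can be tried without touching the disk. The sketch below uses io.StringIO objects in place of real files and rsplit with a limit of 1, which stops after the last delimiter; the function name copy_last_fields and the sample lines are invented for illustration.

```python
import io


def copy_last_fields(src, dst, delimiter=":"):
    """Write the part of each line after its last delimiter to dst."""
    for line in src:
        # rsplit(..., 1) splits once from the right; [-1] is the last field.
        # Lines without the delimiter are copied unchanged.
        dst.write(line.rsplit(delimiter, 1)[-1])


# In-memory stand-ins for the input and output files (made-up data).
source = io.StringIO("alice:10 Main St\nbob:work:22 Oak Ave\nno delimiter here\n")
target = io.StringIO()
copy_last_fields(source, target)
print(target.getvalue())
```

Any pair of open file objects can be passed instead of the StringIO instances.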
partition string in python and get value of last segment after colon
result = mystring.rpartition(':')[2]
If your string does not have any :, the result will contain the original string.
An alternative that is supposed to be a little bit slower is:
result = mystring.split(':')[-1]
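Both expressions give the same result, and the speed difference can be measured rather than assumed. A rough sketch with timeit follows; the sample string and repetition count are arbitrary choices, and the timings will vary by machine and Python version.

```python
import timeit

mystring = "scheme:host:port:value"  # made-up sample input

via_rpartition = mystring.rpartition(":")[2]
via_split = mystring.split(":")[-1]

# Time each expression over the same number of repetitions.
t_rpartition = timeit.timeit(lambda: mystring.rpartition(":")[2], number=100_000)
t_split = timeit.timeit(lambda: mystring.split(":")[-1], number=100_000)

print(via_rpartition, via_split)
print(f"rpartition: {t_rpartition:.4f}s  split: {t_split:.4f}s")
```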
Splitting and then removing last character in comprehension
Try the following:
text = "firstX secondY thirdZ"
text_lst = [x[:-1] for x in text.split(' ')]
print(text_lst)
output
['first', 'second', 'third']
Pandas: Split string on last occurrence
I think you need indexing by str, which works with iterables:
# select the last element of each list
df_client["Subject"].str.rsplit("-", n=1).str[-1]
# select the second element of each list
df_client["Subject"].str.rsplit("-", n=1).str[1]
If performance is important, use a list comprehension:
df_client['last_col'] = [x.rsplit("-", 1)[-1] for x in df_client["Subject"]]
print(df_client)
                      Subject  last_col
0  Activity-Location-UserCode  UserCode
1  Activity-Location-UserCode  UserCode