Split String Last Delimiter

Splitting on last delimiter in Python string?

Use .rsplit() or .rpartition() instead:

s.rsplit(',', 1)
s.rpartition(',')

str.rsplit() lets you specify how many times to split, while str.rpartition() only splits once but always returns a fixed number of elements (prefix, delimiter, and suffix) and is faster for the single-split case.

Demo:

>>> s = "a,b,c,d"
>>> s.rsplit(',', 1)
['a,b,c', 'd']
>>> s.rsplit(',', 2)
['a,b', 'c', 'd']
>>> s.rpartition(',')
('a,b,c', ',', 'd')

Both methods start splitting from the right-hand-side of the string; by giving str.rsplit() a maximum as the second argument, you get to split just the right-hand-most occurrences.

If you only need the last element, but there is a chance that the delimiter is not present in the input string or is the very last character in the input, use the following expressions:

# last element, or the original string if `,` is absent or is the last character
s.rsplit(',', 1)[-1] or s
s.rpartition(',')[-1] or s

If you need the delimiter gone even when it is the last character, I'd use:

def last(string, delimiter):
    """Return the last element from string, after the delimiter

    If string ends in the delimiter or the delimiter is absent,
    returns the original string without the delimiter.

    """
    prefix, delim, last = string.rpartition(delimiter)
    if not delim:
        # delimiter absent: rpartition() returned ('', '', string)
        return string
    return last or prefix

This uses the fact that string.rpartition() returns the delimiter as the second element of the result only if it was present, and an empty string otherwise.
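To make that behaviour concrete, here is a minimal sketch that prints what str.rpartition() returns when the delimiter is in the middle, at the very end, or absent. The names cases and results are only illustrative.

```python
# Collect what str.rpartition() returns in the three cases that matter here.
cases = ("a,b,c,d", "a,b,c,", "abcd")
results = {s: s.rpartition(",") for s in cases}

for s, parts in results.items():
    print(repr(s), "->", parts)
```

Only the middle element tells you whether the delimiter was found. An empty prefix or tail can also come from a leading or trailing delimiter.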

Split string on the last occurrence of some character

It might be easier to just assume that files which end with a dot followed by alphanumeric characters have extensions.

int p = filePath.lastIndexOf(".");
String e = filePath.substring(p + 1);
if (p == -1 || !e.matches("\\w+")) {
    /* file has no extension */
} else {
    /* file has extension e */
}

See the Java docs for regular expression patterns. Remember to double the backslash in the Java string literal ("\\w+"), because the regex engine needs to receive a single backslash (\w+).
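For comparison, the same heuristic might look like this in Python. This is a sketch, not a port: the helper name extension_of is hypothetical, and Python's \w is Unicode-aware by default, so it accepts more characters than Java's default \w.

```python
import re


def extension_of(file_path):
    """Return the extension (without the dot), or None if there is none.

    Hypothetical helper: treats a file as having an extension only when
    the text after the last dot consists of one or more word characters.
    """
    _, dot, ext = file_path.rpartition(".")
    if dot and re.fullmatch(r"\w+", ext):
        return ext
    return None


print(extension_of("archive.tar.gz"))
print(extension_of("README"))
```

Note that a dot inside a directory name, as in "dir.d/file", does not count as an extension here because the text after it contains a slash.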

split string last delimiter

These use no packages. They assume that each element of col2 has at least one underscore. (See note if lifting this restriction is needed.)

1) The first regular expression, (.*)_.*, matches everything up to the last underscore, the underscore itself, and then everything remaining; the first sub replaces the entire match with the part captured in parentheses. This works because matching is greedy: the first .* takes as much as it can, so the underscore it stops at is the last one, leaving only the remainder for the second .* . The second regular expression, .*_, matches everything up to and including the last underscore, and the second sub replaces that with the empty string.

transform(df, col2 = sub("(.*)_.*", "\\1", col2), col3 = sub(".*_", "", col2))

2) Here is a variation that is a bit more symmetric. It uses the same regular expression for both sub calls.

pat <- "(.*)_(.*)"
transform(df, col2 = sub(pat, "\\1", col2), col3 = sub(pat, "\\2", col2))

Note: If we did want to handle strings with no underscore at all such that "xyz" is split into "xyz" and "" then use this for the second sub. It tries to match the left hand side of the | first and if that fails (which will occur if there are no underscores) then the entire string will match the right hand side and sub will replace that with the empty string.

sub(".*_|^[^_]*$", "", col2)
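The same greedy-matching idea can be tried out in Python with re.sub. This is a rough sketch under the assumption that the patterns carry over unchanged; split_last_underscore is a hypothetical helper name, and count=1 is used to mirror the single replacement that R's sub performs.

```python
import re


def split_last_underscore(s):
    """Split s at its last underscore into (head, tail).

    Hypothetical helper mirroring the two sub() calls; a string with no
    underscore comes back as (s, "").
    """
    # Greedy (.*) runs to the last underscore; keep only what it captured.
    head = re.sub(r"(.*)_.*", r"\1", s, count=1)
    # Drop everything through the last underscore, or the whole string
    # when there is no underscore at all.
    tail = re.sub(r".*_|^[^_]*$", "", s, count=1)
    return head, tail


print(split_last_underscore("a_b_c"))
print(split_last_underscore("xyz"))
```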

How to split a string at the last occurrence of a sequence

The range(of:...) method of String has a .backwards option
to find the last occurrence of a string.
Then substring(to:) and substring(from:) can be used with the
lower/upper bound of that range to extract the parts of the string
preceding/following the separator:

func parseTuple(from string: String) -> (String, Int)? {

    if let theRange = string.range(of: "###", options: .backwards),
        let i = Int(string.substring(from: theRange.upperBound)) {
        return (string.substring(to: theRange.lowerBound), i)
    } else {
        return nil
    }
}

Example:

if let tuple = parseTuple(from: "Connect###Four###Player###7") {
print(tuple)
// ("Connect###Four###Player", 7)
}

Swift 4 update:

func parseTuple(from string: String) -> (String, Int)? {

    if let theRange = string.range(of: "###", options: .backwards),
        let i = Int(string[theRange.upperBound...]) {
        // ..< excludes lowerBound, so the separator is not part of the prefix
        return (String(string[..<theRange.lowerBound]), i)
    } else {
        return nil
    }
}
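For readers comparing languages, the same "split at the last separator, then parse the tail" idea could be sketched in Python as follows. The function name parse_tuple and its separator default are assumptions made for illustration. Python's int() is also more lenient than Swift's Int(_:) (it accepts surrounding whitespace, for example), so this is not an exact equivalent.

```python
def parse_tuple(text, separator="###"):
    """Split text at the last separator and parse the tail as an int.

    Returns (head, number), or None when the separator is missing or
    the tail is not an integer. Illustrative sketch, not a port.
    """
    head, sep, tail = text.rpartition(separator)
    if not sep:
        return None
    try:
        return head, int(tail)
    except ValueError:
        return None


print(parse_tuple("Connect###Four###Player###7"))
```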

Split Character String Using Only Last Delimiter in r

A solution based on stringi and data.table: reverse each string, split it into at most two pieces at the first delimiter (which is the last delimiter of the original), and then reverse the pieces back:

library(stringi)
x <- c('foo - bar', 'hey-now-man', 'say-now-girl', 'fine-now')

lapply(stri_split_regex(stri_reverse(x), pattern = '[-\\s]+', n = 2), stri_reverse)

If we want to make a data.frame with this:

y <- lapply(stri_split_regex(stri_reverse(x), pattern = '[-\\s]+', n = 2), stri_reverse)

y <- setNames(data.table::transpose(y)[2:1], c('output1', 'output2'))

df <- as.data.frame(c(list(input = x), y))

# > df
#          input output1 output2
# 1    foo - bar     foo     bar
# 2  hey-now-man hey-now     man
# 3 say-now-girl say-now    girl
# 4     fine-now    fine     now
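The reverse, split, reverse trick is not specific to R. A minimal Python sketch of the same idea follows, with split_last as a hypothetical helper name. It assumes the delimiter pattern reads the same in both directions, which holds for a character class like the one used here but not for patterns with an internal order.

```python
import re


def split_last(s, pattern=r"[-\s]+"):
    """Split s once at the last run of characters matching pattern."""
    # Reverse, split once from the left, then undo both reversals.
    pieces = re.split(pattern, s[::-1], maxsplit=1)
    return [piece[::-1] for piece in reversed(pieces)]


for item in ("foo - bar", "hey-now-man", "say-now-girl", "fine-now"):
    print(item, "->", split_last(item))
```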

pandas split by last delimiter

With Series.str.rsplit, limiting the number of splits.

df.col1.str.rsplit('|', n=1, expand=True).rename(lambda x: f'col{x + 1}', axis=1)

If the above raises a SyntaxError, you're on a Python version older than 3.6, which lacks f-strings (shame on you!). Use this instead:

df.col1.str.rsplit('|', n=1, expand=True)\
    .rename(columns=lambda x: 'col{}'.format(x + 1))

          col1 col2
0      MLB|NBA  NFL
1          MLB  NBA
2  NFL|NHL|NBA  MLB

There's also the faster loopy str.rsplit equivalent.

pd.DataFrame(
    [x.rsplit('|', 1) for x in df.col1.tolist()],
    columns=['col1', 'col2']
)

          col1 col2
0      MLB|NBA  NFL
1          MLB  NBA
2  NFL|NHL|NBA  MLB

P.S., yes, the second solution is faster:

df = pd.concat([df] * 100000, ignore_index=True)

%timeit df.col1.str.rsplit('|', n=1, expand=True)
%timeit pd.DataFrame([x.rsplit('|', 1) for x in df.col1.tolist()])

473 ms ± 13.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
128 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

How to split a string in shell and get the last field

You can use string operators:

$ foo=1:2:3:4:5
$ echo "${foo##*:}"
5

This trims everything from the front up to and including the last ':', because the match is greedy.

${foo   <-- from variable foo
  ##    <-- greedy front trim
  *     <-- matches anything
  :     <-- until the last ':'
 }

Split a string at the last occurrence of the separator in golang

Since this is for path operations, and it looks like you don't want the trailing path separator, path.Dir does what you're looking for:

fmt.Println(path.Dir("a/b/c/d/e"))
// a/b/c/d

If this is specifically for filesystem paths, you will want to use the filepath package instead, to properly handle platform-specific path separators.
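As a point of comparison, Python's standard library offers similar helpers for slash-separated paths. The sketch below assumes POSIX-style separators. Unlike Go's path.Dir, posixpath.dirname does not normalize the path, so results may differ for inputs with trailing or doubled slashes.

```python
import posixpath
from pathlib import PurePosixPath

p = "a/b/c/d/e"

# Plain string operation: everything before the last "/".
parent_str = posixpath.dirname(p)

# Object-based variant; .name is the part after the last "/".
parent_path = PurePosixPath(p).parent
last_part = PurePosixPath(p).name

print(parent_str, parent_path, last_part)
```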

Second-to-last occurrence of delimiter-split string

Another option could be just a match with a positive lookahead assertion, using a negated character class that excludes commas and newlines before asserting the end of the string.

\w+(?=,[^,\n]*$)
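Here is a small Python sketch of how that pattern could be applied. The name second_to_last is hypothetical, and the pattern is assumed to behave the same way in Python's re module as in the regex flavour of the original question.

```python
import re

# \w+ must be followed by a comma and then only non-comma,
# non-newline characters up to the end of the string.
SECOND_TO_LAST = re.compile(r"\w+(?=,[^,\n]*$)")


def second_to_last(s):
    """Return the second-to-last comma-separated field, or None."""
    match = SECOND_TO_LAST.search(s)
    return match.group(0) if match else None


print(second_to_last("one,two,three"))
```

Because the match itself is \w+, a field containing spaces or punctuation is only partially captured; widen the character class if your fields can contain such characters.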


How to split a string into 2 at the last occurrence of an underscore character

You can use lastIndexOf on String, which returns the index of the last occurrence of a sequence of characters.

String thing = "132131_12313_1321_312";
int index = thing.lastIndexOf("_");
String yourCuttedString = thing.substring(0, index);

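lastIndexOf returns -1 if the delimiter is not found in the String; substring(0, index) then throws a StringIndexOutOfBoundsException, so check the index first.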
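The "not found" case deserves an explicit check in other languages too. As an illustration only, here is the equivalent guard in Python, where str.rfind also returns -1 but a careless slice fails silently instead of raising. The helper name before_last is hypothetical.

```python
def before_last(s, delimiter="_"):
    """Return the part of s before the last delimiter (hypothetical helper).

    Returns s unchanged when the delimiter does not occur.
    """
    index = s.rfind(delimiter)
    if index == -1:
        # Without this check, s[:index] would be s[:-1] and silently
        # drop the last character instead of raising an error.
        return s
    return s[:index]


print(before_last("132131_12313_1321_312"))
```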


