Splitting on first occurrence
From the docs:
str.split([sep[, maxsplit]])
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements).
s.split('mango', 1)[1]
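Note that indexing with [1] raises IndexError when the separator does not occur in the string. A minimal sketch of a guarded variant follows; the helper name after_first and the sample strings are hypothetical, chosen only for illustration.

```python
def after_first(s, sep):
    """Return the text after the first occurrence of sep, or '' if sep is absent."""
    parts = s.split(sep, 1)  # at most one split, so at most two elements
    return parts[1] if len(parts) == 2 else ''


print(after_first('apple mango banana mango', 'mango'))  # text after the first 'mango'
print(repr(after_first('apple banana', 'mango')))        # separator absent
```

Returning an empty string for a missing separator is one possible convention; raising an error or returning None may suit other callers better.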
split string only on first instance of specified character
Use capturing parentheses:
'good_luck_buddy'.split(/_(.*)/s)
['good', 'luck_buddy', ''] // ignore the third element
Capturing parentheses in a split separator are documented as follows:
If separator contains capturing parentheses, matched results are returned in the array.
So in this case we want to split at _.* (i.e. the split separator is a substring starting with _) but also let the result contain part of our separator (i.e. everything after _).
In this example our separator (matching _(.*)) is _luck_buddy and the captured group (within the separator) is luck_buddy. Without the capturing parentheses, luck_buddy (matching .*) would not have been included in the result array, because a plain split does not include separators in the result.
We use the s regex flag to make . match newline (\n) characters as well; otherwise the captured group would end at the first newline.
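For comparison, Python's re.split also returns the text of capturing groups in its result, so the same technique can be sketched there. This is an illustrative Python analogue, not part of the JavaScript answer; re.DOTALL plays the role of the s flag.

```python
import re

# The separator is '_' plus everything after it; the capturing group puts
# "everything after it" back into the result list.
parts = re.split(r'_(.*)', 'good_luck_buddy', maxsplit=1, flags=re.DOTALL)
print(parts)

head, tail = parts[0], parts[1]  # the trailing empty string is ignored
print(head, tail)
```

If the input contains no underscore, the list has a single element, so the indexing on the last lines would need a length check first.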
Split string at first occurrence of a number
Try splitting on the first occurrence of [ ](?=\d):
import re

text = "MARIA APARECIDA 99223-2000 / 98450-8026"
# Split once, on the first space that is followed by a digit
parts = re.split(r' (?=\d)', text, maxsplit=1)
print(parts)
This prints:
['MARIA APARECIDA', '99223-2000 / 98450-8026']
Note that the regex pattern used splits and consumes a single space, but does not consume the digit that follows (lookaheads do not advance the position in the input).
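To make the lookahead behavior concrete, the sketch below wraps the same pattern in a hypothetical helper (split_before_number is not from the original answer) and also shows what happens when no space is followed by a digit.

```python
import re


def split_before_number(text):
    """Split once at the first space that is immediately followed by a digit."""
    # The lookahead (?=\d) only checks for a digit; the digit stays in the tail.
    return re.split(r' (?=\d)', text, maxsplit=1)


print(split_before_number("MARIA APARECIDA 99223-2000 / 98450-8026"))
print(split_before_number("NO NUMBERS HERE"))  # no match: a one-element list
```

Callers that unpack the result into two variables should check its length first, since a string without a matching position comes back unsplit.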
Is there a function to split by the FIRST instance of a delimiter in C?
The first time you call strtok, use the delimiter you want to split with.
For the second call, use an empty delimiter string (if you really want the rest of the string) or use "\n", in the case that your string might include a newline character and you don't want that in the split (or even "\r\n"):
const char* first = strtok(buf, ":");
const char* rest = strtok(NULL, "");
/* or: const char* rest = strtok(NULL, "\n"); */
Split string at separator after first occurrence
You can try to use regular expressions for this job.
Note that this regular expression is both very specific and fairly generic, since it is based on the single example you gave.
import re
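# Raw string, so the backslashes reach the regex engine unchanged
_REGEX = re.compile(r'^(((\.\.?)?\/)*[^\/]*)((\/?(\.\.)?)*)$')

def split_path(path):
    # Group 1: leading path up to the first named component
    # Group 4: the trailing run of '/' and '..' segments
    structure = _REGEX.match(path or '').groups()
    return structure[0], structure[3]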
Testing
>>> split_path('../../../folder.123/../..')
('../../../folder.123', '/../..')
>>> split_path('../../../folder.123')
('../../../folder.123', '')
>>> split_path('folder.123')
('folder.123', '')
>>> split_path('/')
('/', '')
>>> split_path('')
('', '')
split string only on first instance - java
string.split("=", 2);
As the documentation for String.split(java.lang.String regex, int limit) explains:
The array returned by this method contains each substring of this string that is terminated by another substring that matches the given expression or is terminated by the end of the string. The substrings in the array are in the order in which they occur in this string. If the expression does not match any part of the input then the resulting array has just one element, namely this string.
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.
The string "boo:and:foo", for example, yields the following results with these parameters:
Regex   Limit   Result
:       2       { "boo", "and:foo" }
:       5       { "boo", "and", "foo" }
:       -2      { "boo", "and", "foo" }
o       5       { "b", "", ":and:f", "", "" }
o       -2      { "b", "", ":and:f", "", "" }
o       0       { "b", "", ":and:f" }
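For readers comparing with Python: for a literal (non-regex) delimiter, a positive Java limit n corresponds roughly to maxsplit = n - 1 in str.split. The sketch below is an illustrative Python analogue, not Java code. Python always keeps trailing empty strings, which matches the negative-limit rows above rather than the limit 0 row.

```python
s = "boo:and:foo"

# Java limit 2 applies the pattern at most once, like maxsplit=1
print(s.split(":", 1))
# Java limit 5 allows up to four splits, like maxsplit=4
print(s.split(":", 4))
print(s.split("o", 4))
# No maxsplit: split everywhere and keep trailing empty strings
print(s.split("o"))
```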
Split String at First Occurrence of an Integer using R
You can use tidyr::extract:
library(tidyr)
df <- df %>%
extract("name_and_address", c("name", "address"), "(\\D*)(\\d.*)")
## => df
## name address
## 1 Mr. Smith 12 Some street
## 2 Mr. Jones 345 Another street
## 3 Mr. Anderson 6 A different street
The (\D*)(\d.*) regex matches the following:
(\D*) - Group 1: any zero or more non-digit chars
(\d.*) - Group 2: a digit and then any zero or more chars, as many as possible.
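The two-group pattern is not specific to R. As a rough illustration, here is the same regex in Python; the function name split_at_first_digit and the no-digit fallback are assumptions made for this sketch, not part of the original answer.

```python
import re


def split_at_first_digit(s):
    """Return (text before the first digit, text from the first digit on)."""
    m = re.match(r'(\D*)(\d.*)', s, flags=re.DOTALL)
    if m is None:
        # Assumption for this sketch: with no digit, keep everything in the first part.
        return s, ''
    return m.group(1), m.group(2)


print(split_at_first_digit("Mr. Smith12 Some street"))
print(split_at_first_digit("No digits, sorry."))
```

Group 1 can be empty, so a string that starts with a digit yields an empty first part and the whole string as the second.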
Another solution with stringr::str_split is also possible:
str_split(df$name_and_address, "(?=\\d)", n=2)
## => [[1]]
## [1] "Mr. Smith" "12 Some street"
## [[2]]
## [1] "Mr. Jones" "345 Another street"
## [[3]]
## [1] "Mr. Anderson" "6 A different street"
The (?=\d) positive lookahead finds a location before a digit, and n=2 tells stringr::str_split to only split into 2 chunks max.
Base R approach that does not return anything if there is no digit in the string:
df = data.frame(name_and_address = c("Mr. Smith12 Some street", "Mr. Jones345 Another street", "Mr. Anderson6 A different street", "1 digit is at the start", "No digits, sorry."))
df$name <- sub("^(?:(\\D*)\\d.*|.+)", "\\1", df$name_and_address)
df$address <- sub("^\\D*(\\d.*)?", "\\1", df$name_and_address)
df$name
# => [1] "Mr. Smith" "Mr. Jones" "Mr. Anderson" "" ""
df$address
# => [1] "12 Some street" "345 Another street"
# [3] "6 A different street" "1 digit is at the start" ""
This also supports cases where the first digit is the first character in the string.
Split string using separator skipping first occurrence
You can use the str.rsplit method with a maxsplit of 1 instead:
file_path.rsplit('/', maxsplit=1)[0]
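A short sketch of what rsplit returns may help here. The path below is a made-up example, and the last line shows the behavior when the separator is absent.

```python
file_path = "/usr/local/lib/python3/site.py"  # hypothetical example path

# rsplit scans from the right, so maxsplit=1 splits at the last '/'
directory, filename = file_path.rsplit('/', maxsplit=1)
print(directory)
print(filename)

# With no '/' present, rsplit returns a one-element list, so [0] is the whole string
print("site.py".rsplit('/', maxsplit=1)[0])
```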
How can I split a string into two on the first occurrence of a character
str.split takes a maxsplit argument; pass 1 to split only on the first -:
print(components[i].rstrip().split('-', 1))
To store the output in two variables:
In [7]: s = "console-3.45.1-0"
In [8]: a,b = s.split("-",1)
In [9]: a
Out[9]: 'console'
In [10]: b
Out[10]: '3.45.1-0'
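Unpacking the result of split into two variables raises ValueError when the string contains no separator. One alternative, sketched below, is str.partition, which always returns three items; the variable names are illustrative only.

```python
s = "console-3.45.1-0"

# partition always returns exactly three items: head, separator, tail
a, sep, b = s.partition("-")
print(a, b)

# If the separator is missing, sep and tail are empty strings instead of an error
a2, sep2, b2 = "console".partition("-")
print(repr(a2), repr(sep2), repr(b2))
```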