Use String.split() with multiple delimiters
I think you need to include the regex OR operator:
String[] tokens = pdfName.split("-|\\.");
What you have will match a DASH followed by a DOT together (-.), not a DASH or a DOT on its own (- or .).
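The same distinction can be sketched with Python's re module, which uses the same | alternation syntax. The file name below is a made-up example, not the one from the question.

```python
import re

# Hypothetical file name, for illustration only.
pdf_name = "report-2020.pdf"

# "-\." matches only a dash immediately followed by a dot.
# That sequence never occurs here, so nothing is split.
together = re.split(r"-\.", pdf_name)

# "-|\." matches a dash OR a dot, so both act as delimiters.
either = re.split(r"-|\.", pdf_name)

print(together)  # ['report-2020.pdf']
print(either)    # ['report', '2020', 'pdf']
```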
Split multiple delimiters in Java
Try with
split("\\t|,|;|\\.|\\?|!|-|:|@|\\[|\\]|\\(|\\)|\\{|\\}|_|\\*|/");
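When every delimiter is a single character, as above, a character class is usually easier to read than a long alternation. The following is a minimal sketch in Python rather than Java; the sample text is invented, and the check simply confirms that both patterns split it the same way.

```python
import re

# The alternation from the answer, written as a Python raw string.
alternation = r"\t|,|;|\.|\?|!|-|:|@|\[|\]|\(|\)|\{|\}|_|\*|/"

# The same delimiters as one character class.
# Inside [...] only -, [ and ] are escaped here.
char_class = r"[\t,;.?!\-:@\[\](){}_*/]"

# Made-up sample text, for illustration only.
sample = "one,two;three.four(five)six_seven/eight"

by_alternation = re.split(alternation, sample)
by_class = re.split(char_class, sample)

print(by_class)
# ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight']
```

In Java, the same character class would need its backslashes doubled inside the string literal.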
See also: Use String.split() with multiple delimiters
String Splitting with multiple delimiters in JavaScript
Try this (the i and g flags are not needed for split):
var dateParts1 = "2020-01-31".split(/[\/-]/);
console.log(dateParts1); // ["2020", "01", "31"]
var dateParts2 = "2020/02/21".split(/[\/-]/);
console.log(dateParts2); // ["2020", "02", "21"]
Python split string by multiple delimiters following a hierarchy
Try:
import re
# Each entry is [input string, expected result of the split].
tests = [
    ["121 34 adsfd", ["121 34 adsfd"]],
    ["dsfsd and adfd", ["dsfsd ", " adfd"]],
    ["dsfsd & adfd", ["dsfsd ", " adfd"]],
    ["dsfsd - adfd", ["dsfsd ", " adfd"]],
    ["dsfsd and adfd and adsfa", ["dsfsd ", " adfd and adsfa"]],
    ["dsfsd and adfd - adsfa", ["dsfsd ", " adfd - adsfa"]],
    ["dsfsd - adfd and adsfa", ["dsfsd - adfd ", " adsfa"]],
]
for s, result in tests:
    # maxsplit=1: split only on the first match of the pattern.
    res = re.split(r"and|&(?!.*and)|-(?!.*and|.*&)", s, maxsplit=1)
    print(res)
    assert res == result
Prints:
['121 34 adsfd']
['dsfsd ', ' adfd']
['dsfsd ', ' adfd']
['dsfsd ', ' adfd']
['dsfsd ', ' adfd and adsfa']
['dsfsd ', ' adfd - adsfa']
['dsfsd - adfd ', ' adsfa']
Explanation:
The regex and|&(?!.*and)|-(?!.*and|.*&) uses 3 alternatives:
- We match and always, or:
- We match & only if there isn't an and ahead (using the negative look-ahead (?! )), or:
- We match - only if there isn't an and or & ahead.
We're using this pattern in re.split with maxsplit=1 -> splitting only on the first match.
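The same hierarchy can also be expressed without look-aheads. The sketch below is a different technique from the regex above: it tries each delimiter in priority order and uses str.partition to split on its first occurrence. The function name split_by_priority is hypothetical.

```python
def split_by_priority(text, delimiters=("and", "&", "-")):
    """Split once on the highest-priority delimiter present in text."""
    for delim in delimiters:
        # partition splits on the first occurrence of delim;
        # sep is empty when delim does not occur at all.
        head, sep, tail = text.partition(delim)
        if sep:
            return [head, tail]
    # No delimiter found: return the string unchanged.
    return [text]


print(split_by_priority("dsfsd - adfd and adsfa"))
# ['dsfsd - adfd ', ' adsfa']
print(split_by_priority("121 34 adsfd"))
# ['121 34 adsfd']
```

This is easier to extend to more delimiters, at the cost of one scan of the string per delimiter.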
Split string with multiple delimiters in Python
Luckily, Python has this built-in :)
import re
re.split('; |, ',str)
Update:
Following your comment:
>>> a='Beautiful, is; better*than\nugly'
>>> import re
>>> re.split(r'; |, |\*|\n',a)
['Beautiful', 'is', 'better', 'than', 'ugly']
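If the delimiters come from a list, the pattern can be built with re.escape so that characters such as * do not have to be escaped by hand. This is a minimal sketch; the helper name split_on_any is hypothetical.

```python
import re


def split_on_any(text, delimiters):
    """Split text on any of the given literal delimiter strings."""
    # Try longer delimiters first so a short one cannot shadow
    # a longer one that starts with the same characters.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape makes each delimiter match literally.
    pattern = "|".join(re.escape(d) for d in ordered)
    return re.split(pattern, text)


a = 'Beautiful, is; better*than\nugly'
words = split_on_any(a, ["; ", ", ", "*", "\n"])
print(words)  # ['Beautiful', 'is', 'better', 'than', 'ugly']
```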
Split String with multiple delimiters and keep delimiters
Try with parentheses (a capturing group makes re.split keep the delimiters):
>>> split_str = re.split("(and | or | & | /)", input_str)
>>> split_str
['X < -500', ' & ', 'Y > 3000', ' /', ' Z > 50']
>>>
If you want to remove extra spaces:
>>> split_str = [i.strip() for i in re.split("(and | or | & | /)", input_str)]
>>> split_str
['X < -500', '&', 'Y > 3000', '/', 'Z > 50']
>>>
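As an alternative to stripping afterwards, the whitespace can be moved outside the capturing group so that only the operator itself is kept. A minimal sketch: input_str is not shown in the answer, so the value below is an assumption reconstructed from the output above.

```python
import re

# Assumed input, reconstructed from the output shown above.
input_str = "X < -500 & Y > 3000 / Z > 50"

# \s+ sits outside the capturing group: the surrounding whitespace is
# consumed by the match, but only the operator is returned.
parts = re.split(r"\s+(and|or|&|/)\s+", input_str)
print(parts)  # ['X < -500', '&', 'Y > 3000', '/', 'Z > 50']

# Operands sit at even indexes, delimiters at odd indexes.
operands = parts[0::2]
operators = parts[1::2]
print(operands)   # ['X < -500', 'Y > 3000', 'Z > 50']
print(operators)  # ['&', '/']
```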
Splitting strings using multiple delimiters in Python. Getting TypeError: expected string or bytes-like object
re is a library that receives a string (or bytes-like object), not a Pandas DataFrame column. You should use the .str accessor in this case:
df[['A']] = df['Sport'].str.split(r';,')
I hope this resolves your problem.