How to split by commas that are not within parentheses?
Use a negative lookahead to match every comma that is not inside parentheses. Splitting the input string on the matched commas gives the desired output.
,\s*(?![^()]*\))
>>> import re
>>> s = "Water, Titanium Dioxide (CI 77897), Black 2 (CI 77266), Iron Oxides (CI 77491, 77492, 77499), Ultramarines (CI 77007)"
>>> re.split(r',\s*(?![^()]*\))', s)
['Water', 'Titanium Dioxide (CI 77897)', 'Black 2 (CI 77266)', 'Iron Oxides (CI 77491, 77492, 77499)', 'Ultramarines (CI 77007)']
Split string by comma, but ignore commas within brackets
This regex works on your example:
,(?=[^,]+?:)
Here, we use a positive lookahead to find commas that are followed by one or more non-comma characters and then a colon. This correctly finds the <comma><key> pattern you are searching for. Of course, if the keys are allowed to contain commas, this would have to be adapted a little further.
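As a rough illustration of the lookahead described above, here is a minimal Python sketch. The input string is hypothetical (the original question's example is not reproduced here); it only assumes keys that contain no commas and are followed by a colon.

```python
import re

# Hypothetical input: comma-separated key:value pairs whose values may
# themselves contain commas inside square brackets.
s = "a:[1,2,3],b:[4,5],c:6"

# Split only on commas that are followed by non-comma characters and a colon,
# i.e. commas that directly precede the next key.
parts = re.split(r",(?=[^,]+?:)", s)
print(parts)
```

Note that this relies on the shape of the keys rather than on the brackets, so it would not help when the values can contain colons of their own.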
split by comma if comma not in between brackets while allowing characters to be outside the brackets with in the same comma split
You may use this regex with a lookahead for split:
>>> s = """aa,bb,(cc,dd),m(ee,ff)"""
>>> print(re.split(r',(?![^()]*\))', s))
['aa', 'bb', '(cc,dd)', 'm(ee,ff)']
RegEx Details:
, : Match a comma
(?![^()]*\)) : A negative lookahead assertion that makes sure we don't match a comma inside (...) by asserting that there is no ) ahead after 0 or more non-parenthesis characters.
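The breakdown above can also be written directly into the pattern with Python's re.VERBOSE flag. This is only a sketch of the same regex with inline comments; the variable names are illustrative.

```python
import re

# Same pattern as above, spelled out with comments via re.VERBOSE.
pattern = re.compile(r"""
    ,            # a literal comma
    (?!          # negative lookahead: reject this comma if what follows is
      [^()]*     #   zero or more characters that are not parentheses
      \)         #   and then a closing parenthesis
    )
""", re.VERBOSE)

s = "aa,bb,(cc,dd),m(ee,ff)"
result = pattern.split(s)
print(result)
```

Because the lookahead only scans up to the nearest parenthesis, this approach is meant for a single level of parentheses.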
Split string by comma if not within square brackets or parentheses
You can use the following regex with the global flag.
,(?![^\(\[]*[\]\)])
It is inspired by https://stackoverflow.com/a/9030062/1630604.
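A minimal Python sketch of the same pattern follows; the input string is a made-up example. In Python, re.split already splits on every match, so no global flag is needed.

```python
import re

# Hypothetical input mixing square brackets and parentheses.
s = "a,b[c,d],e(f,g),h"

# Split on commas that are not followed by a closing ] or ) before the
# next opening ( or [.
parts = re.split(r",(?![^\(\[]*[\]\)])", s)
print(parts)
```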
Regex split by comma not inside parenthesis (.NET)
This PCRE regex uses recursion:
(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,
.NET does not support recursion, but the same thing can be done with a balancing construct. Of the PCRE verbs (*SKIP) and (*FAIL), only (*FAIL) can be written as (?!) (it causes an unconditional fail at the place where it stands); .NET does not support skipping a match at a specific position and resuming the search from that failed position.
I suggest replacing all commas that are not inside nested parentheses with some temporary value, and then splitting the string with that value:
var s = Regex.Replace(text, @"\((?>[^()]+|(?<o>)\(|(?<-o>)\))*(?(o)(?!))\)|(,)", m =>
m.Groups[1].Success ? "___temp___" : m.Value);
var results = s.Split("___temp___");
Details
\((?>[^()]+|(?<o>)\(|(?<-o>)\))*(?(o)(?!))\) - a pattern that matches nested parentheses:
  \( - a ( char
  (?>[^()]+|(?<o>)\(|(?<-o>)\))* - 0 or more occurrences of
    [^()]+ - 1+ chars other than ( and )
    | - or
    (?<o>)\( - a ( and a value is pushed on to the Group "o" stack
    | - or
    (?<-o>)\) - a ) and a value is popped from the Group "o" stack
  (?(o)(?!)) - a conditional construct that fails the match if Group "o" stack is not empty
  \) - a ) char
| - or
(,) - Group 1: a comma
Only the comma captured in Group 1 is replaced with a temp substring, since the m.Groups[1].Success check is performed in the match evaluator part.
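Python's built-in re module has no balancing groups or recursion, so the following is not a port of the .NET regex. It is a sketch of a different technique with the same goal: a plain depth counter that splits only on commas outside any level of nested parentheses. The function name split_top_level and its parameters are hypothetical.

```python
def split_top_level(text, sep=",", opener="(", closer=")"):
    """Split text on sep, ignoring separators inside nested opener/closer pairs."""
    parts = []
    current = []
    depth = 0  # number of currently open parentheses
    for ch in text:
        if ch == opener:
            depth += 1
        elif ch == closer and depth > 0:
            depth -= 1
        if ch == sep and depth == 0:
            # Top-level separator: close the current piece.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


result = split_top_level("a,f(b,(c,d)),e")
print(result)
```

Unlike the single-level lookahead patterns, this keeps f(b,(c,d)) intact. How unbalanced parentheses should be handled is a design choice; here a stray closing parenthesis is simply ignored.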
explode commas but ignore commas within brackets php
We can make a slight correction to your current regex splitting logic by using the following pattern:
,(?![^(]+\))
This says to split on a comma, but only if that comma does not occur inside a term in parentheses. It works by using a negative lookahead to check that we do not see a ) without first seeing an opening (, which would imply that the comma is inside a (...) term.
$string = "Beer - Domestic,Food - Snacks (chips,dips,nuts),Beer - Imported,UNCATEGORIZED";
$keywords = preg_split("/,(?![^(]+\))/", $string);
print_r($keywords);
This prints:
Array
(
    [0] => Beer - Domestic
    [1] => Food - Snacks (chips,dips,nuts)
    [2] => Beer - Imported
    [3] => UNCATEGORIZED
)
how to split string into array on commas but ignore commas in parentheses
You may use ,(?![^\(]*[\)]) with a list comprehension:
import re

s = '''
a VARCHAR(20),
b FLOAT, c FLOAT,
d NUMBER(38,0), e NUMBER(38,0)
'''
# Split on commas outside parentheses, then trim surrounding whitespace.
print([i.strip() for i in re.split(r',(?![^\(]*[\)])', s)])
# ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']