Splitting String With Parentheses

Groovy/Java split string on parentheses (

println "Hello World(1)".split("\\(");
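
In plain Java the same idea applies: `String.split` takes a regular expression, so a literal `(` has to be escaped. A minimal sketch follows; the class and method names are only illustrative, and `Pattern.quote` is shown as an alternative way to build the literal pattern.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitOnParen {
    // Splits on a literal "(" by escaping it inside the regex.
    static String[] splitEscaped(String s) {
        return s.split("\\(");
    }

    // Same split, letting Pattern.quote produce the literal pattern.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("("));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEscaped("Hello World(1)"))); // [Hello World, 1)]
        System.out.println(Arrays.toString(splitQuoted("Hello World(1)")));  // [Hello World, 1)]
    }
}
```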

Splitting string with parentheses

You may use:

String[] results = s.split("\\s*[()]\\s*");

See the regex demo

Pattern details

  • \\s* - 0+ whitespaces
  • [()] - a ) or (
  • \\s* - 0+ whitespaces

If your strings are always in the format specified (no parentheses, (...), no parentheses), you will have:

Name with space                    = results[0]
The values inside the brackets     = results[1]
The CONST value after the brackets = results[2]
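
As a rough illustration of the split approach, the sketch below applies the pattern to the sample input `flow gavage(ZAB_B2_COCUM) BS`. The class and method names are made up for the example.

```java
import java.util.Arrays;

class SplitAroundParens {
    // Splits on "(" or ")" and swallows any whitespace around them.
    static String[] parts(String s) {
        return s.split("\\s*[()]\\s*");
    }

    public static void main(String[] args) {
        String[] results = parts("flow gavage(ZAB_B2_COCUM) BS");
        System.out.println(Arrays.toString(results)); // [flow gavage, ZAB_B2_COCUM, BS]
    }
}
```

Note that `split` drops trailing empty strings, so an input that ends right after `)` yields only two elements.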

If you want a more controlled approach, use a matching regex:

Pattern.compile("^([^()]*)\\(([^()]*)\\)(.*)$")

See the regex demo

If you use it with Matcher#matches(), you may omit ^ and $ since that method requires a full string match.

Java demo:

String regex = "^([^()]*)\\(([^()]*)\\)(.*)$";
String s = "flow gavage(ZAB_B2_COCUM) BS";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);
if (matcher.matches()){
    System.out.println(matcher.group(1).trim());
    System.out.println(matcher.group(2).trim());
    System.out.println(matcher.group(3).trim());
}

Here, the pattern means:

  • ^ - start of the string (implicit in .matches())
  • ([^()]*) - Capturing group 1: any 0+ chars other than ( and )
  • \\( - a (
  • ([^()]*) - Capturing group 2: any 0+ chars other than ( and )
  • \\) - a )
  • (.*) - Capturing group 3: any 0+ chars, as many as possible, up to the end of the line (use ([^()]*) if you need to restrict ( and ) in this part, too).
  • $ - end of string (implicit in .matches())
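
To illustrate the note about `Matcher#matches()`, here is a small sketch that drops `^` and `$` and also handles input that does not fit the format. The `ParenParts` class and the `parse` helper are hypothetical names, not part of any library.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenParts {
    // No ^ or $ here: matches() already requires the whole input to match.
    private static final Pattern P = Pattern.compile("([^()]*)\\(([^()]*)\\)(.*)");

    // Returns the three trimmed parts, or an empty Optional if the input does not fit.
    static Optional<String[]> parse(String s) {
        Matcher m = P.matcher(s);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {
            m.group(1).trim(), m.group(2).trim(), m.group(3).trim()
        });
    }

    public static void main(String[] args) {
        parse("flow gavage(ZAB_B2_COCUM) BS")
            .ifPresent(p -> System.out.println(Arrays.toString(p)));
        System.out.println(parse("no parentheses here").isPresent()); // false
    }
}
```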

How to split string while ignoring portion in parentheses?

Instead of describing what you want to split on, it is often easier to describe the pieces you want to keep as a regular expression and collect them with a global match:

var str = "bibendum, morbi, non, quam (nec, dui, luctus), rutrum, nulla";
str.match(/[^,(]+(?:\([^)]*\))?/g) // the simple one (keeps leading whitespace)
str.match(/[^,\s]+(?:\s+\([^)]*\))?/g) // not matching whitespaces
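
The same match-instead-of-split idea can be sketched in Java with `Matcher.find()`. This is only an illustration using the whitespace-free pattern; the class and method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaItems {
    // A run of chars that are neither commas nor whitespace, optionally followed
    // by whitespace and one parenthesized group (which may contain commas).
    private static final Pattern ITEM =
        Pattern.compile("[^,\\s]+(?:\\s+\\([^)]*\\))?");

    static List<String> items(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = ITEM.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "bibendum, morbi, non, quam (nec, dui, luctus), rutrum, nulla";
        // Prints one item per line; "quam (nec, dui, luctus)" stays in one piece.
        items(str).forEach(System.out::println);
    }
}
```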

R string split on parentheses, keeping the parentheses in the split with its content

You can use

x <- "A(B)C"
library(stringr)
str_extract_all(x, "\\([^()]*\\)|[^()]+")

See the R demo and the regex demo. Details:

  • \([^()]*\) - a (, zero or more chars other than ( and ) and then )
  • | - or
  • [^()]+ - one or more chars other than ( and ).
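
For comparison, the same alternation works unchanged with Java's `java.util.regex`. The sketch below is an assumed equivalent of the extraction step, not a port of `stringr`, and the names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeepParens {
    // Either a whole "(...)" group or a run of chars containing no parentheses.
    private static final Pattern TOKEN = Pattern.compile("\\([^()]*\\)|[^()]+");

    static List<String> tokens(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("A(B)C")); // [A, (B), C]
    }
}
```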

Splitting a string with nested parentheses at only the top level where level is determined by the parentheses

If your character vectors are in the format you showed, you can achieve what you need with a single PCRE regex:

(?:\G(?!^)\s*,\s*|^N\()\K(?:\d+|\w+(\([^()]*(?:(?1)[^()]*)*\)))(?=\s*,|\)$)

See the regex demo. Details:

  • (?:\G(?!^)\s*,\s*|^N\() - either the end of the previous successful match (\G(?!^)) followed by a comma enclosed with zero or more whitespace chars (\s*,\s*), or an N( string at the start of the string (^N\()
  • \K - a match reset operator that discards all text matched so far from the current match memory buffer
  • (?: - start of non-capturing group
    • \d+ - one or more digits
    • | - or
    • \w+ - one or more word chars
    • (\([^()]*(?:(?1)[^()]*)*\)) - Group 1 (needed for recursion to work correctly): a (, then any zero or more chars other than a ( and ), then zero or more occurrences of the Group 1 pattern (recursed) and then zero or more chars other than ( and ) and then a ) char
  • ) - end of the non-capturing group
  • (?=\s*,|\)$) - immediately followed by either zero or more whitespaces and a comma, or a ) char at the end of the string.

See the R demo:

strs <- c("N(0, 1)", "N(N(0.1, 1), 1)", "N(U(0, 1), 1)", "N(0, T(0, 1))", "N(N(0, 1), N(0, 1))")
p <- "(?:\\G(?!^)\\s*,\\s*|^N\\()\\K(?:\\d+|\\w+(\\([^()]*(?:(?1)[^()]*)*\\)))(?=\\s*,|\\)$)"
regmatches(strs, gregexpr(p, strs, perl=TRUE))
# => [[1]]
# [1] "0" "1"
#
# [[2]]
# [1] "N(0.1, 1)" "1"
#
# [[3]]
# [1] "U(0, 1)" "1"
#
# [[4]]
# [1] "0" "T(0, 1)"
#
# [[5]]
# [1] "N(0, 1)" "N(0, 1)"
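
Regex engines without recursion support (Java's `java.util.regex`, for example) cannot run the pattern above. A different technique, a plain depth-counting scan, gives the same top-level split. The sketch below is that swapped-in approach, not a translation of the regex. It assumes the input has the shape `NAME(arg, arg, ...)` with balanced parentheses, and the names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplit {
    // Returns the arguments of the outermost call, splitting only on commas
    // that sit at nesting depth 0 inside the outer parentheses.
    static List<String> topLevelArgs(String s) {
        List<String> parts = new ArrayList<>();
        int open = s.indexOf('(');
        int close = s.lastIndexOf(')');
        if (open < 0 || close < open) {
            return parts; // not in the assumed NAME(...) shape
        }
        String inner = s.substring(open + 1, close);
        int depth = 0;
        int start = 0;
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(inner.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(inner.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(topLevelArgs("N(0, 1)"));             // [0, 1]
        System.out.println(topLevelArgs("N(N(0, 1), N(0, 1))")); // [N(0, 1), N(0, 1)]
    }
}
```

Unlike the regex, this scan does not check that the outer name is `N` or that each argument is a number or a nested call. It only tracks nesting depth.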

Splitting a string in C# to extract values in parentheses and keep them

You can use the Regex.Matches method instead:

using System.Linq;
using System.Text.RegularExpressions;

string phrase = "123(0)(1)";
string[] results = Regex.Matches(phrase, @"\(.*?\)").Cast<Match>().Select(m => m.Value).ToArray();
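
For readers working in Java, a rough counterpart of the `Regex.Matches` call is a `Matcher.find()` loop over the same lazy pattern. This is a sketch with illustrative names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenGroups {
    // Lazily matches each "(...)" group, parentheses included.
    private static final Pattern GROUP = Pattern.compile("\\(.*?\\)");

    static List<String> groups(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = GROUP.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groups("123(0)(1)")); // [(0), (1)]
    }
}
```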

Split string based on parentheses in javascript

You can match all up to the first ( and then all between that first ( and the ) that is at the end of the string, and use