Character "|" in Strsplit Function (Vertical Bar/Pipe)

strsplit with vertical bar (pipe)

As ?strsplit explains, the split argument of strsplit is a regular expression. Hence you either need to escape the vertical bar (it is a special character)

strsplit(r,split='\\|and')

or you can choose fixed=TRUE to indicate that split is not a regular expression

strsplit(r,split='|and',fixed=TRUE)

Character | in strsplit function (vertical bar / pipe)

It's because the split argument is interpreted as a regular expression, and | is a special character in a regex.

To get round this, you have two options:

Option 1: Escape the |, i.e. split = "\\|"

strsplit("ty|rr", split = "\\|")
[[1]]
[1] "ty" "rr"

Option 2: Specify fixed = TRUE:

strsplit("ty|rr", split = "|", fixed = TRUE)
[[1]]
[1] "ty" "rr"

Please also note the See Also section of ?strsplit, which tells you to read ?"regular expression" for details of the pattern specification.

How to strsplit using '|' character, it behaves unexpectedly?

The problem is that by default strsplit interprets " | " as a regular expression, in which | has special meaning (as "or").

Use the fixed argument:

unlist(strsplit("I am | very smart", " | ", fixed=TRUE))
# [1] "I am" "very smart"

A side effect is faster computation.

stringr alternative:

unlist(stringr::str_split("I am | very smart", stringr::fixed(" | ")))

Splitting string by delimiter in R

The character you are using has a special meaning in regular expressions: it means OR. So your split pattern is like this:

empty string OR empty string == empty string

and that's why your input string is split character by character.
To use it as a normal character without its special regular expression meaning, you have to escape it, like this:

strsplit(x, "\\|")

python : Split string separated by a pipe symbol |

The

parts = line.split['|']

should be

parts = line.split('|')

(i.e. with parentheses instead of square brackets.)

Trying to split a string based on the character | doesn't work

You need to use fixed = TRUE to interpret meta-characters such as | literally:

string <- "Hello | Good bye!"
strsplit(string, "|", fixed = TRUE)
#[[1]]
#[1] "Hello " " Good bye!"

Similarly,

strsplit("Hello . Good bye!", ".")[[1]]
#[1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
strsplit("Hello . Good bye!", ".", fixed = TRUE)[[1]]
#[1] "Hello " " Good bye!"

Alternatively, you can manually escape such characters with double backslashes,

strsplit("Hello | Good bye!", "\\|")[[1]]
#[1] "Hello " " Good bye!"

or wrap them with \\Q...\\E, which will escape all non-alphanumeric characters:

strsplit("Hello | Good bye!", "\\Q|\\E")[[1]]
#[1] "Hello " " Good bye!"

Splitting string with pipe character (|)

| is a metacharacter in regex. You'd need to escape it:

String[] value_split = rat_values.split("\\|");

How do I split a string in Java?

Use the appropriately named method String#split().

String string = "004-034556";
String[] parts = string.split("-");
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556

Note that split's argument is assumed to be a regular expression, so remember to escape special characters if necessary.

There are 12 characters with special meanings: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), the opening square bracket [, and the opening curly brace {. These special characters are often called "metacharacters".

For instance, to split on a period/dot . (which means "any character" in regex), use either backslash \ to escape the individual special character like so split("\\."), or use character class [] to represent literal character(s) like so split("[.]"), or use Pattern#quote() to escape the entire string like so split(Pattern.quote(".")).

String[] parts = string.split(Pattern.quote(".")); // Split on the exact string.
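The three escaping styles described above can be compared side by side. This is a minimal sketch: the class name LiteralSplitDemo, its helper methods, and the sample input "1.20.3" are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralSplitDemo {
    // Escape the single metacharacter with a backslash.
    static String[] byBackslash(String s) {
        return s.split("\\.");
    }

    // Put the character in a character class, where it is literal.
    static String[] byCharClass(String s) {
        return s.split("[.]");
    }

    // Let Pattern.quote() build a literal pattern for the whole string.
    static String[] byQuote(String s) {
        return s.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        String version = "1.20.3"; // illustrative input
        System.out.println(Arrays.toString(byBackslash(version))); // [1, 20, 3]
        System.out.println(Arrays.toString(byCharClass(version))); // [1, 20, 3]
        System.out.println(Arrays.toString(byQuote(version)));     // [1, 20, 3]

        // An unescaped dot matches every character, so only empty
        // strings remain, and split() drops trailing empty strings.
        System.out.println(version.split(".").length); // 0
    }
}
```

All three variants give the same result for a single character. Pattern.quote() is probably the most convenient choice when the delimiter comes from user input or contains several metacharacters.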

To test beforehand if the string contains certain character(s), just use String#contains().

if (string.contains("-")) {
    // Split it.
} else {
    throw new IllegalArgumentException("String " + string + " does not contain -");
}

Note, this does not take a regular expression. For that, use String#matches() instead.
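The difference between the two checks can be sketched as follows. The class ContainsVsMatches, its helper methods, and the pattern \d+-\d+ are assumptions made for this illustration. Note that matches() only succeeds when the regular expression matches the entire string.

```java
class ContainsVsMatches {
    // Literal substring test: "-" is taken as-is, no regex involved.
    static boolean hasDash(String s) {
        return s.contains("-");
    }

    // Regex test: the whole string must be digits, a dash, then digits.
    static boolean isTwoNumericParts(String s) {
        return s.matches("\\d+-\\d+");
    }

    public static void main(String[] args) {
        System.out.println(hasDash("004-034556"));           // true
        System.out.println(hasDash("004034556"));            // false
        System.out.println(isTwoNumericParts("004-034556")); // true
        System.out.println(isTwoNumericParts("004-abc"));    // false

        // contains() treats the pipe literally ...
        System.out.println("a|b".contains("|")); // true
        // ... while matches() reads "a|b" as the regex "a OR b",
        // which cannot match the three-character string "a|b".
        System.out.println("a|b".matches("a|b")); // false
    }
}
```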

If you'd like to retain the split character in the resulting parts, then make use of positive lookaround. If you want the split character to end up on the left-hand side, use positive lookbehind by wrapping the pattern in a (?<=...) group.

String string = "004-034556";
String[] parts = string.split("(?<=-)");
String part1 = parts[0]; // 004-
String part2 = parts[1]; // 034556

If you want the split character to end up on the right-hand side, use positive lookahead by wrapping the pattern in a (?=...) group.

String string = "004-034556";
String[] parts = string.split("(?=-)");
String part1 = parts[0]; // 004
String part2 = parts[1]; // -034556

If you'd like to limit the number of resulting parts, you can supply the desired number as the second argument of the split() method.

String string = "004-034556-42";
String[] parts = string.split("-", 2);
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556-42

Using strsplit with pipe separator in R

You just need to add two backslashes before the bar:

strsplit(x, "\\|")

For example:

> x <- "Hello | Could you help me please?"
> strsplit(x, "\\|")
[[1]]
[1] "Hello " " Could you help me please?"

Splitting a Java String by the pipe symbol using split(|)

You need

test.split("\\|");

split takes a regular expression, and in regex | is a metacharacter representing the OR operator. You need to escape that character using \ (written in a String literal as "\\", since \ is also an escape character in String literals and requires another \ to escape it).

You can also use

test.split(Pattern.quote("|"));

and let Pattern.quote create the escaped version of the regex representing |.
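To see the two fixes next to the failing call, here is a minimal sketch. The class name PipeSplitDemo, the helper methods, and the input "a|b|c" are hypothetical and chosen only for illustration. The per-character result noted for the unescaped pattern assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Escape the pipe by hand.
    static String[] splitEscaped(String s) {
        return s.split("\\|");
    }

    // Let Pattern.quote() produce the literal pattern.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String test = "a|b|c"; // illustrative input

        // Unescaped: "|" means "empty OR empty", which matches between
        // every pair of characters, so each character (including the
        // pipes) becomes its own element on Java 8+.
        System.out.println(Arrays.toString(test.split("|")));

        System.out.println(Arrays.toString(splitEscaped(test))); // [a, b, c]
        System.out.println(Arrays.toString(splitQuoted(test)));  // [a, b, c]

        // Pattern.quote wraps its argument in \Q...\E.
        System.out.println(Pattern.quote("|")); // \Q|\E
    }
}
```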


