How to Split a String with a String Delimiter

How can I split a string with a string delimiter?

string[] tokens = str.Split(new[] { "is Marco and" }, StringSplitOptions.None);

If you have a single character delimiter (like for instance ,), you can reduce that to (note the single quotes):

string[] tokens = str.Split(',');

Parse (split) a string in C++ using string delimiter (standard C++)

You can use the std::string::find() function to find the position of your string delimiter, then use std::string::substr() to get a token.

Example:

std::string s = "scott>=tiger";
std::string delimiter = ">=";
std::string token = s.substr(0, s.find(delimiter)); // token is "scott"
  • The find(const string& str, size_t pos = 0) function returns the position of the first occurrence of str in the string, or npos if the string is not found.

  • The substr(size_t pos = 0, size_t n = npos) function returns a substring of the object, starting at position pos and of length npos.


If you have multiple delimiters, after you have extracted one token, you can remove it (delimiter included) to proceed with subsequent extractions (if you want to preserve the original string, just use s = s.substr(pos + delimiter.length());):

s.erase(0, s.find(delimiter) + delimiter.length());

This way you can easily loop to get each token.

Complete Example

std::string s = "scott>=tiger>=mushroom";
std::string delimiter = ">=";

size_t pos = 0;
std::string token;
while ((pos = s.find(delimiter)) != std::string::npos) {
token = s.substr(0, pos);
std::cout << token << std::endl;
s.erase(0, pos + delimiter.length());
}
std::cout << s << std::endl;

Output:

scott
tiger
mushroom

String delimiter in string.split method

There is no need to set the delimiter by breaking it up in pieces like you have done.

Here is a complete program you can compile and run:

import java.util.Arrays;
public class SplitExample {
public static final String PLAYER = "1||1||Abdul-Jabbar||Karim||1996||1974";
public static void main(String[] args) {
String[] data = PLAYER.split("\\|\\|");
System.out.println(Arrays.toString(data));
}
}

If you want to use split with a pattern, you can use Pattern.compile or Pattern.quote.

To see compile and quote in action, here is an example using all three approaches:

import java.util.Arrays;
import java.util.regex.Pattern;
public class SplitExample {
public static final String PLAYER = "1||1||Abdul-Jabbar||Karim||1996||1974";
public static void main(String[] args) {
String[] data = PLAYER.split("\\|\\|");
System.out.println(Arrays.toString(data));

Pattern pattern = Pattern.compile("\\|\\|");
data = pattern.split(PLAYER);
System.out.println(Arrays.toString(data));

pattern = Pattern.compile(Pattern.quote("||"));
data = pattern.split(PLAYER);
System.out.println(Arrays.toString(data));
}
}

The use of patterns is recommended if you are going to split often using the same pattern. BTW the output is:

[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
[1, 1, Abdul-Jabbar, Karim, 1996, 1974]
[1, 1, Abdul-Jabbar, Karim, 1996, 1974]

How do I split a string in Java?

Use the appropriately named method String#split().

String string = "004-034556";
String[] parts = string.split("-");
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556

Note that split's argument is assumed to be a regular expression, so remember to escape special characters if necessary.

there are 12 characters with special meanings: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), and the opening square bracket [, the opening curly brace {, These special characters are often called "metacharacters".

For instance, to split on a period/dot . (which means "any character" in regex), use either backslash \ to escape the individual special character like so split("\\."), or use character class [] to represent literal character(s) like so split("[.]"), or use Pattern#quote() to escape the entire string like so split(Pattern.quote(".")).

String[] parts = string.split(Pattern.quote(".")); // Split on the exact string.

To test beforehand if the string contains certain character(s), just use String#contains().

if (string.contains("-")) {
// Split it.
} else {
throw new IllegalArgumentException("String " + string + " does not contain -");
}

Note, this does not take a regular expression. For that, use String#matches() instead.

If you'd like to retain the split character in the resulting parts, then make use of positive lookaround. In case you want to have the split character to end up in left hand side, use positive lookbehind by prefixing ?<= group on the pattern.

String string = "004-034556";
String[] parts = string.split("(?<=-)");
String part1 = parts[0]; // 004-
String part2 = parts[1]; // 034556

In case you want to have the split character to end up in right hand side, use positive lookahead by prefixing ?= group on the pattern.

String string = "004-034556";
String[] parts = string.split("(?=-)");
String part1 = parts[0]; // 004
String part2 = parts[1]; // -034556

If you'd like to limit the number of resulting parts, then you can supply the desired number as 2nd argument of split() method.

String string = "004-034556-42";
String[] parts = string.split("-", 2);
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556-42

How do I split a string by a multi-character delimiter in C#?

http://msdn.microsoft.com/en-us/library/system.string.split.aspx

Example from the docs:

string source = "[stop]ONE[stop][stop]TWO[stop][stop][stop]THREE[stop][stop]";
string[] stringSeparators = new string[] {"[stop]"};
string[] result;

// ...
result = source.Split(stringSeparators, StringSplitOptions.None);

foreach (string s in result)
{
Console.Write("'{0}' ", String.IsNullOrEmpty(s) ? "<>" : s);
}

Split a string into an array of strings based on a delimiter

you can use the TStrings.DelimitedText property for split an string

check this sample

program Project28;

{$APPTYPE CONSOLE}

uses
Classes,
SysUtils;

procedure Split(Delimiter: Char; Str: string; ListOfStrings: TStrings) ;
begin
ListOfStrings.Clear;
ListOfStrings.Delimiter := Delimiter;
ListOfStrings.StrictDelimiter := True; // Requires D2006 or newer.
ListOfStrings.DelimitedText := Str;
end;


var
OutPutList: TStringList;
begin
OutPutList := TStringList.Create;
try
Split(':', 'word:doc,txt,docx', OutPutList) ;
Writeln(OutPutList.Text);
Readln;
finally
OutPutList.Free;
end;
end.

UPDATE

See this link for an explanation of StrictDelimiter.

How to split a string, but also keep the delimiters?

You can use lookahead and lookbehind, which are features of regular expressions.

System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));

And you will get:

[a;, b;, c;, d]
[a, ;b, ;c, ;d]
[a, ;, b, ;, c, ;, d]

The last one is what you want.

((?<=;)|(?=;)) equals to select an empty character before ; or after ;.

EDIT: Fabian Steeg's comments on readability is valid. Readability is always a problem with regular expressions. One thing I do to make regular expressions more readable is to create a variable, the name of which represents what the regular expression does. You can even put placeholders (e.g. %1$s) and use Java's String.format to replace the placeholders with the actual string you need to use; for example:

static public final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

public void someMethod() {
final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
...
}

How do I split a string on a delimiter in Bash?

You can set the internal field separator (IFS) variable, and then let it parse into an array. When this happens in a command, then the assignment to IFS only takes place to that single command's environment (to read ). It then parses the input according to the IFS variable value into an array, which we can then iterate over.

This example will parse one line of items separated by ;, pushing it into an array:

IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
# process "$i"
done

This other example is for processing the whole content of $IN, each time one line of input separated by ;:

while IFS=';' read -ra ADDR; do
for i in "${ADDR[@]}"; do
# process "$i"
done
done <<< "$IN"

Split string with delimiters in C

You can use the strtok() function to split a string (and specify the delimiter to use). Note that strtok() will modify the string passed into it. If the original string is required elsewhere make a copy of it and pass the copy to strtok().

EDIT:

Example (note it does not handle consecutive delimiters, "JAN,,,FEB,MAR" for example):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

char** str_split(char* a_str, const char a_delim)
{
char** result = 0;
size_t count = 0;
char* tmp = a_str;
char* last_comma = 0;
char delim[2];
delim[0] = a_delim;
delim[1] = 0;

/* Count how many elements will be extracted. */
while (*tmp)
{
if (a_delim == *tmp)
{
count++;
last_comma = tmp;
}
tmp++;
}

/* Add space for trailing token. */
count += last_comma < (a_str + strlen(a_str) - 1);

/* Add space for terminating null string so caller
knows where the list of returned strings ends. */
count++;

result = malloc(sizeof(char*) * count);

if (result)
{
size_t idx = 0;
char* token = strtok(a_str, delim);

while (token)
{
assert(idx < count);
*(result + idx++) = strdup(token);
token = strtok(0, delim);
}
assert(idx == count - 1);
*(result + idx) = 0;
}

return result;
}

int main()
{
char months[] = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC";
char** tokens;

printf("months=[%s]\n\n", months);

tokens = str_split(months, ',');

if (tokens)
{
int i;
for (i = 0; *(tokens + i); i++)
{
printf("month=[%s]\n", *(tokens + i));
free(*(tokens + i));
}
printf("\n");
free(tokens);
}

return 0;
}

Output:

$ ./main.exe
months=[JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC]

month=[JAN]
month=[FEB]
month=[MAR]
month=[APR]
month=[MAY]
month=[JUN]
month=[JUL]
month=[AUG]
month=[SEP]
month=[OCT]
month=[NOV]
month=[DEC]


Related Topics



Leave a reply



Submit