Splitting a csv file with quotes as text-delimiter using String.split()

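public static void main(String[] args) {
    String s = "Sachin,,M,\"Maths,Science,English\",Need to improve in these subjects.";
    // Split on commas that are followed by an even number of quotes
    String[] splitted = s.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
    System.out.println(java.util.Arrays.toString(splitted));
}

Output:

[Sachin, , M, "Maths,Science,English", Need to improve in these subjects.]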

Java: splitting a comma-separated string but ignoring commas in quotes


public class Main {
    public static void main(String[] args) {
        String line = "foo,bar,c;qual=\"baz,blurb\",d;junk=\"quux,syzygy\"";
        // A limit of -1 keeps trailing empty strings in the result
        String[] tokens = line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
        for (String t : tokens) {
            System.out.println("> " + t);
        }
    }
}


> foo
> bar
> c;qual="baz,blurb"
> d;junk="quux,syzygy"

In other words: split on the comma only if that comma has zero, or an even number of quotes ahead of it.
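
The same parity rule can be applied without a regex. The following is only a minimal sketch of the idea, not part of the original answer: the class name QuoteParitySplit and its split helper are hypothetical, and the input is assumed to contain balanced quotes. It scans from right to left so that the number of quotes ahead of each comma is always known.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class QuoteParitySplit {
    // Hypothetical helper: splits on every comma that has zero or an even
    // number of quotes ahead of it (to its right).
    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        int quotesAhead = 0;     // quotes seen to the right of position i
        int end = line.length(); // exclusive end of the token being collected
        for (int i = line.length() - 1; i >= 0; i--) {
            char c = line.charAt(i);
            if (c == '"') {
                quotesAhead++;
            } else if (c == ',' && quotesAhead % 2 == 0) {
                tokens.add(line.substring(i + 1, end));
                end = i;
            }
        }
        tokens.add(line.substring(0, end));
        Collections.reverse(tokens); // tokens were collected right to left
        return tokens;
    }

    public static void main(String[] args) {
        String line = "foo,bar,c;qual=\"baz,blurb\",d;junk=\"quux,syzygy\"";
        for (String t : split(line)) {
            System.out.println("> " + t);
        }
    }
}
```

For the sample line this prints the same four tokens as the regex version. Unlike the lookahead, which rescans the rest of the string for every comma, this sketch makes a single pass.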

Or, a bit friendlier for the eyes:

public class Main {
    public static void main(String[] args) {
        String line = "foo,bar,c;qual=\"baz,blurb\",d;junk=\"quux,syzygy\"";

        String otherThanQuote = " [^\"] ";
        String quotedString = String.format(" \" %s* \" ", otherThanQuote);
        String regex = String.format("(?x) "+ // enable comments, ignore white spaces
                ",                         "+ // match a comma
                "(?=                       "+ // start positive look ahead
                "  (?:                     "+ //   start non-capturing group 1
                "    %s*                   "+ //     match 'otherThanQuote' zero or more times
                "    %s                    "+ //     match 'quotedString'
                "  )*                      "+ //   end group 1 and repeat it zero or more times
                "  %s*                     "+ //   match 'otherThanQuote'
                "  $                       "+ // match the end of the string
                ")                         ", // stop positive look ahead
                otherThanQuote, quotedString, otherThanQuote);

        String[] tokens = line.split(regex, -1);
        for (String t : tokens) {
            System.out.println("> " + t);
        }
    }
}

which produces the same as the first example.
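
Both examples pass -1 as the second argument to split(). A small sketch of why that matters, assuming a line that ends in empty fields (the class name SplitLimitDemo and the fields helper are made up for illustration): with the default limit, String#split() silently drops trailing empty strings, while a negative limit keeps them.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // The quote-aware pattern used in the examples above
    static final String REGEX = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Hypothetical helper: splits a line with the given limit
    static String[] fields(String line, int limit) {
        return line.split(REGEX, limit);
    }

    public static void main(String[] args) {
        String line = "foo,\"a,b\",,"; // the last two fields are empty

        // Limit 0 (the default): trailing empty strings are removed
        System.out.println(Arrays.toString(fields(line, 0)));  // [foo, "a,b"]

        // Negative limit: trailing empty strings are kept
        System.out.println(Arrays.toString(fields(line, -1))); // [foo, "a,b", , ]
    }
}
```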


As mentioned by @MikeFHay in the comments:

I prefer using Guava's Splitter, as it has saner defaults (see discussion above about empty matches being trimmed by String#split()), so I did:


Splitting a CSV File in Java that has extra commas and extra quotes in them

Thanks to Andreas and Tamas Hegedus for helping you clarify the question! Try:

br = new BufferedReader(new FileReader(customerListAllCustomers));
while ((line = br.readLine()) != null) {
    // one column, so don't need to use comma as separator
    String line2 = line.replaceAll("^\"","").replaceAll("\"$","").replaceAll("\\\"","\"");
}

The replaceAll calls strip leading quotes (^\") and trailing quotes (\"$), and then unescape the remaining quotes (\\\").
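
One detail worth checking is the escaping in the third call. As a Java string literal, "\\\"" is the regex \", which matches a bare quote, so replacing it with a quote changes nothing. The sketch below assumes the file escapes embedded quotes with a backslash (that input format is an assumption, and UnquoteDemo and unquote are hypothetical names); under that assumption the literal needs four backslashes before the quote.

```java
class UnquoteDemo {
    // Hypothetical helper; assumes embedded quotes appear as \" in the input
    static String unquote(String line) {
        return line.replaceAll("^\"", "")          // strip a leading quote
                   .replaceAll("\"$", "")          // strip a trailing quote
                   .replaceAll("\\\\\"", "\"");    // regex \\" : backslash + quote -> quote
    }

    public static void main(String[] args) {
        // Raw text of the line: "He said \"hi\", then left"
        String raw = "\"He said \\\"hi\\\", then left\"";
        System.out.println(unquote(raw)); // He said "hi", then left
    }
}
```

If the file instead doubles embedded quotes, as many CSV writers do, the third pattern would need to match two consecutive quotes rather than a backslash and a quote.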

split a comma-separated string with both quoted and unquoted strings

Depending on your needs you may not be able to use a csv parser, and may in fact want to re-invent the wheel!!

You can do so with a simple regex:

(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)

This will do the following:

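(?:^|,) = A non-capturing group that matches either the beginning of the string or a comma.

(\"(?:[^\"]+|\"\")*\"|[^,]*) = A numbered capture group that selects between two alternatives:

  1. stuff in quotes, where a doubled quote (\"\") is allowed inside the field
  2. stuff between commas, i.e. any run of characters other than a comma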

This should give you the output you are looking for.

Example code in C#

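using System.Collections.Generic;
using System.Text.RegularExpressions;

static Regex csvSplit = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);

public static string[] SplitCSV(string input)
{
    List<string> list = new List<string>();
    string curr = null;
    foreach (Match match in csvSplit.Matches(input))
    {
        curr = match.Value;
        // An empty match is an empty field
        if (0 == curr.Length) { list.Add(""); continue; }

        // Drop the separator that the non-capturing group consumed
        list.Add(curr.TrimStart(','));
    }

    return list.ToArray();
}

private void button1_Click(object sender, RoutedEventArgs e)
{
    // Illustrative input (an assumption; the original call was not preserved)
    string[] fields = SplitCSV("a,\"b,c\",d");
}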

Warning: As per @MrE's comment, if a rogue newline character appears in a badly formed CSV file and you end up with an unbalanced quote ("string), you'll get catastrophic backtracking (https://www.regular-expressions.info/catastrophic.html) in your regex and your system will likely crash (like our production system did). This can easily be replicated in Visual Studio, which, as I've discovered, it will also crash. A simple try/catch will not trap this issue either.

You should use:



How to split csv whose columns may contain comma

Use the Microsoft.VisualBasic.FileIO.TextFieldParser class. This will handle parsing a delimited file, TextReader or Stream where some fields are enclosed in quotes and some are not.

For example:

using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

string csv = "2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,\"Corvallis, OR\",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34";

TextFieldParser parser = new TextFieldParser(new StringReader(csv));

// You can also read from a file
// TextFieldParser parser = new TextFieldParser("mycsvfile.csv");

parser.HasFieldsEnclosedInQuotes = true;
parser.SetDelimiters(",");

string[] fields;

while (!parser.EndOfData)
{
    fields = parser.ReadFields();
    foreach (string field in fields)
    {
        // Print each field on its own line
        Console.WriteLine(field);
    }
}

parser.Close();

This should result in the following output:

2
1016
7/31/2008 14:22
Geoff Dalgas
6/5/2011 22:21
http://stackoverflow.com
Corvallis, OR
7679
351
81
b437f461b3fd27387c5d8ab47a293d35
34

See Microsoft.VisualBasic.FileIO.TextFieldParser for more information.

You need to add a reference to Microsoft.VisualBasic in the Add References .NET tab.

Delimit a string by character unless within quotation marks C#

Copied from my comment: Use an available CSV parser such as VisualBasic.FileIO.TextFieldParser.

As requested, here is an example for the TextFieldParser:

var allLineFields = new List<string[]>();
string sampleText = "Method,\"value1,value2\"";
var reader = new System.IO.StringReader(sampleText);
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader))
{
    parser.Delimiters = new string[] { "," };
    parser.HasFieldsEnclosedInQuotes = true; // <--- !!!
    string[] fields;
    while ((fields = parser.ReadFields()) != null)
    {
        // Collect the fields of each line
        allLineFields.Add(fields);
    }
}

This list now contains a single string[] with two strings. I have used a StringReader because this sample uses a string; if the source is a file, use a StreamReader (e.g. via File.OpenText).

Split string on comma and ignore comma in double quotes

I think you can use the regex ,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$) from here: Splitting on comma outside quotes

You can test the pattern here: http://regexr.com/3cddl

Java code example:

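public static void main(String[] args) {
    String txt = "0, 2, 23131312,\"This, is a message\", 1212312";

    // Split on commas outside quotes and print the resulting fields
    System.out.println(java.util.Arrays.toString(txt.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)")));
}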


