How to Skip Comma from CSV Using Double Quotes

Ignore Comma between double quotes while reading CSV file

You can fix this by replacing the String.Split call with Regex.Split, which splits only on commas that fall outside double quotes.

Table.Rows.Add(row.Split(','));

Should be replaced with

Table.Rows.Add(Regex.Split(row, ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)"));

Then add the namespace import at the top of the file:

using System.Text.RegularExpressions;

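This fixes the problem for records that fit on a single line. Note that the surrounding double quotes remain part of each field value.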
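
To experiment with the same lookahead pattern outside C#, here is a minimal Python sketch. The helper name `split_csv_row` is hypothetical and only for illustration. The pattern matches a comma only when an even number of double quotes follows it, so commas inside a quoted value are left alone. It does not strip the quotes and it cannot handle a quoted value that spans several lines.

```python
import re

# Same pattern as the C# example: a comma followed by an even number of quotes.
_COMMA_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_csv_row(row):
    """Split one CSV line on commas that are not inside double quotes."""
    return _COMMA_OUTSIDE_QUOTES.split(row)


fields = split_csv_row('1,"Smith, John",42')
print(fields)  # ['1', '"Smith, John"', '42']
```

The quoted field comes back with its quotes still attached, so a real reader would still need an unescape step afterwards.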

Dealing with commas in a CSV file

As others have said, you need to escape values that include quotes. Here’s a little CSV reader in C# that supports quoted values, including embedded quotes and carriage returns.

By the way, this is unit-tested code. I’m posting it now because this question seems to come up a lot and others may not want an entire library when simple CSV support will do.

You can use it as follows:

using System;

public class test
{
    public static void Main()
    {
        using ( CsvReader reader = new CsvReader( "data.csv" ) )
        {
            foreach( string[] values in reader.RowEnumerator )
            {
                Console.WriteLine( "Row {0} has {1} values.", reader.RowIndex, values.Length );
            }
        }
        Console.ReadLine();
    }
}

Here are the classes. Note that you can use the Csv.Escape function to write valid CSV as well.

using System.IO;
using System.Text.RegularExpressions;

public sealed class CsvReader : System.IDisposable
{
    public CsvReader( string fileName ) : this( new FileStream( fileName, FileMode.Open, FileAccess.Read ) )
    {
    }

    public CsvReader( Stream stream )
    {
        __reader = new StreamReader( stream );
    }

    public System.Collections.IEnumerable RowEnumerator
    {
        get {
            if ( null == __reader )
                throw new System.ApplicationException( "I can't start reading without CSV input." );

            __rowno = 0;
            string sLine;
            string sNextLine;

            while ( null != ( sLine = __reader.ReadLine() ) )
            {
                // An odd number of quotes means a quoted value continues on the next line.
                while ( rexRunOnLine.IsMatch( sLine ) && null != ( sNextLine = __reader.ReadLine() ) )
                    sLine += "\n" + sNextLine;

                __rowno++;
                string[] values = rexCsvSplitter.Split( sLine );

                for ( int i = 0; i < values.Length; i++ )
                    values[i] = Csv.Unescape( values[i] );

                yield return values;
            }

            __reader.Close();
        }
    }

    public long RowIndex { get { return __rowno; } }

    public void Dispose()
    {
        if ( null != __reader ) __reader.Dispose();
    }

    //============================================

    private long __rowno = 0;
    private TextReader __reader;
    // Matches a comma followed by an even number of double quotes, i.e. a comma outside a quoted value.
    private static Regex rexCsvSplitter = new Regex( @",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))" );
    // Matches a line with an odd number of double quotes, i.e. an unterminated quoted value.
    private static Regex rexRunOnLine = new Regex( @"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$" );
}

public static class Csv
{
    public static string Escape( string s )
    {
        if ( s.Contains( QUOTE ) )
            s = s.Replace( QUOTE, ESCAPED_QUOTE );

        if ( s.IndexOfAny( CHARACTERS_THAT_MUST_BE_QUOTED ) > -1 )
            s = QUOTE + s + QUOTE;

        return s;
    }

    public static string Unescape( string s )
    {
        if ( s.StartsWith( QUOTE ) && s.EndsWith( QUOTE ) )
        {
            s = s.Substring( 1, s.Length - 2 );

            if ( s.Contains( ESCAPED_QUOTE ) )
                s = s.Replace( ESCAPED_QUOTE, QUOTE );
        }

        return s;
    }

    private const string QUOTE = "\"";
    private const string ESCAPED_QUOTE = "\"\"";
    private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };
}
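
The escaping rules in the Csv class are easy to try out in isolation. The following is a rough Python sketch of the same two rules (double any embedded quote, then wrap the value in quotes if it contains a comma, a quote, or a newline). The function names `escape` and `unescape` are illustrative, and the length check in `unescape` is an added assumption that the C# version does not have.

```python
QUOTE = '"'
ESCAPED_QUOTE = '""'
MUST_BE_QUOTED = (',', '"', '\n')


def escape(value):
    """Double embedded quotes, then wrap the value in quotes if it needs them."""
    needs_quotes = any(ch in value for ch in MUST_BE_QUOTED)
    value = value.replace(QUOTE, ESCAPED_QUOTE)
    return QUOTE + value + QUOTE if needs_quotes else value


def unescape(field):
    """Strip the enclosing quotes and collapse doubled quotes."""
    # The length guard is an assumption added here so a lone quote is left as-is.
    if len(field) >= 2 and field.startswith(QUOTE) and field.endswith(QUOTE):
        field = field[1:-1].replace(ESCAPED_QUOTE, QUOTE)
    return field


original = 'He said "hi", then left'
escaped = escape(original)
print(escaped)                        # "He said ""hi"", then left"
print(unescape(escaped) == original)  # True
```

A value without commas, quotes, or newlines passes through both functions unchanged.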

Python parse CSV ignoring comma with double-quotes

This should do:

import csv

lines = '''"AAA", "BBB", "Test, Test", "CCC"
"111", "222, 333", "XXX", "YYY, ZZZ"'''.splitlines()
for l in csv.reader(lines, quotechar='"', delimiter=',',
                    quoting=csv.QUOTE_ALL, skipinitialspace=True):
    print(l)
# ['AAA', 'BBB', 'Test, Test', 'CCC']
# ['111', '222, 333', 'XXX', 'YYY, ZZZ']
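
The `skipinitialspace=True` argument matters for input like the sample above, where a space follows each comma. A small sketch of the difference, using a shortened sample line chosen for illustration:

```python
import csv

line = '"AAA", "Test, Test"'

# Default dialect: the space after the comma starts an unquoted field,
# so the quote that follows is treated as ordinary text.
naive = next(csv.reader([line]))

# skipinitialspace=True drops that space, so the quote opens a quoted field.
fixed = next(csv.reader([line], skipinitialspace=True))

print(naive)  # ['AAA', ' "Test', ' Test"']
print(fixed)  # ['AAA', 'Test, Test']
```

If the file has no space after the delimiter, the default settings of `csv.reader` already keep quoted commas intact.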

How to escape comma and double quote at same time for CSV file?

There are several libraries. Here are two examples:


❐ Apache Commons Lang

Apache Commons Lang includes a special class to escape or unescape strings (CSV, EcmaScript, HTML, Java, Json, XML): org.apache.commons.lang3.StringEscapeUtils.

  • Escape to CSV

    String escaped = StringEscapeUtils
    .escapeCsv("I said \"Hey, I am 5'10\".\""); // I said "Hey, I am 5'10"."

    System.out.println(escaped); // "I said ""Hey, I am 5'10""."""
  • Unescape from CSV

    String unescaped = StringEscapeUtils
    .unescapeCsv("\"I said \"\"Hey, I am 5'10\"\".\"\"\""); // "I said ""Hey, I am 5'10""."""

    System.out.println(unescaped); // I said "Hey, I am 5'10"."



❐ OpenCSV

If you use OpenCSV, you do not need to worry about escaping or unescaping; you only write or read the content.

  • Writing file:

    FileOutputStream fos = new FileOutputStream("awesomefile.csv"); 
    OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8");
    CSVWriter writer = new CSVWriter(osw);
    ...
    String[] row = {
    "123",
    "John",
    "Smith",
    "39",
    "I said \"Hey, I am 5'10\".\""
    };
    writer.writeNext(row);
    ...
    writer.close();
    osw.close();
    fos.close();
  • Reading file:

    FileInputStream fis = new FileInputStream("awesomefile.csv"); 
    InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
    CSVReader reader = new CSVReader(isr);

    for (String[] row; (row = reader.readNext()) != null;) {
    System.out.println(Arrays.toString(row));
    }

    reader.close();
    isr.close();
    fis.close();
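
For comparison, the same write-then-read round trip can be sketched with Python's standard csv module, which also handles the quoting for you. This is only an illustration: it uses an in-memory `io.StringIO` buffer instead of a file, and the row mirrors the sample row above.

```python
import csv
import io

row = ["123", "John", "Smith", "39", 'I said "Hey, I am 5\'10"."']

# Write one row; the writer quotes and doubles embedded quotes as needed.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written = buffer.getvalue()
print(written, end="")  # 123,John,Smith,39,"I said ""Hey, I am 5'10"".""" 

# Read it back; the reader removes the quoting again.
buffer.seek(0)
read_back = next(csv.reader(buffer))
print(read_back == row)  # True
```

When writing to a real file instead of a buffer, the csv documentation recommends opening it with `newline=''`.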



