Is there a way to include commas in CSV columns without breaking the formatting?
Enclose the field in double quotes, e.g.
field1_value,field2_value,"field 3,value",field4, etc...
See Wikipedia.
Updated:
To encode a quote, escape it with another ": a single double-quote character inside a field is encoded as "", so a field that consists of just that one quote becomes """" once it is also enclosed in quotes. So if you see the following in e.g. Excel:
---------------------------------------
| regular_value |,,,"| ,"", |""" |"|
---------------------------------------
the CSV file will contain:
regular_value,",,,""",","""",","""""""",""""
A comma is simply enclosed in quotes, so , becomes ",".
A comma together with quotes needs to be both escaped and enclosed, so "," becomes """,""".
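As a rough illustration of the two rules above, here is a minimal Java sketch. The class name CsvEscape and the method escape are made up for this example and do not come from any library; the sketch assumes comma is the delimiter and that only fields containing a comma, a quote, or a line break need quoting.

```java
// Hypothetical helper for illustration only; not part of any CSV library.
class CsvEscape {

    // Doubles every embedded quote, then wraps the field in quotes,
    // but only when the field contains a character that requires quoting.
    static String escape(String field) {
        boolean needsQuotes = field.contains(",")
                || field.contains("\"")
                || field.contains("\n")
                || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("regular_value")); // regular_value
        System.out.println(escape(","));             // ","
        System.out.println(escape("\",\""));         // """,""" 
    }
}
```

The last two calls reproduce the examples from the answer: a lone comma is only enclosed, while a value that itself contains quotes has them doubled first and is then enclosed.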
Read a CSV file with a delimiter specified in the value itself
You cannot simply use split to parse a CSV file. Either use a library (e.g. SuperCSV or OpenCSV) or follow all the CSV format rules yourself. For instance, here you should ignore the delimiter (i.e. the comma) when it appears between double quotes (you can use a regex for this).
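To show why split falls short, here is a small Java sketch of a hand-written line parser that skips delimiters inside double quotes. It uses a character-by-character loop rather than the regex the answer mentions. The names CsvLineParser and parseLine are invented for this example, and the sketch assumes a comma delimiter and a record that fits on one line (no embedded line breaks); a library is still the safer choice for real data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-line parser for illustration; not taken from SuperCSV or OpenCSV.
class CsvLineParser {

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is one literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are plain text here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "field1_value,field2_value,\"field 3,value\",field4";
        System.out.println(line.split(",").length); // 5: split cuts the quoted field in two
        System.out.println(parseLine(line).size()); // 4
        System.out.println(parseLine(line).get(2)); // field 3,value
    }
}
```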
Dealing with commas in a CSV file
As others have said, you need to escape values that include quotes. Here’s a little CSV reader in C♯ that supports quoted values, including embedded quotes and carriage returns.
By the way, this is unit-tested code. I’m posting it now because this question seems to come up a lot and others may not want an entire library when simple CSV support will do.
You can use it as follows:
using System;

public class test
{
    public static void Main()
    {
        using ( CsvReader reader = new CsvReader( "data.csv" ) )
        {
            foreach( string[] values in reader.RowEnumerator )
            {
                Console.WriteLine( "Row {0} has {1} values.", reader.RowIndex, values.Length );
            }
        }
        Console.ReadLine();
    }
}
Here are the classes. Note that you can use the Csv.Escape function to write valid CSV as well.
using System.IO;
using System.Text.RegularExpressions;

public sealed class CsvReader : System.IDisposable
{
    public CsvReader( string fileName ) : this( new FileStream( fileName, FileMode.Open, FileAccess.Read ) )
    {
    }

    public CsvReader( Stream stream )
    {
        __reader = new StreamReader( stream );
    }

    public System.Collections.IEnumerable RowEnumerator
    {
        get {
            if ( null == __reader )
                throw new System.ApplicationException( "I can't start reading without CSV input." );

            __rowno = 0;
            string sLine;
            string sNextLine;

            while ( null != ( sLine = __reader.ReadLine() ) )
            {
                // A quoted field may span several physical lines; keep appending until the quotes balance.
                while ( rexRunOnLine.IsMatch( sLine ) && null != ( sNextLine = __reader.ReadLine() ) )
                    sLine += "\n" + sNextLine;

                __rowno++;
                string[] values = rexCsvSplitter.Split( sLine );

                for ( int i = 0; i < values.Length; i++ )
                    values[i] = Csv.Unescape( values[i] );

                yield return values;
            }

            __reader.Close();
        }
    }

    public long RowIndex { get { return __rowno; } }

    public void Dispose()
    {
        if ( null != __reader ) __reader.Dispose();
    }

    //============================================

    private long __rowno = 0;
    private TextReader __reader;

    // Matches a comma followed by an even number of quotes, i.e. a comma outside any quoted field.
    private static Regex rexCsvSplitter = new Regex( @",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))" );

    // Matches a line containing an odd number of quotes, i.e. a quoted field that is still open.
    private static Regex rexRunOnLine = new Regex( @"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$" );
}

public static class Csv
{
    // Doubles embedded quotes, then encloses the value in quotes if it contains a special character.
    public static string Escape( string s )
    {
        if ( s.Contains( QUOTE ) )
            s = s.Replace( QUOTE, ESCAPED_QUOTE );

        if ( s.IndexOfAny( CHARACTERS_THAT_MUST_BE_QUOTED ) > -1 )
            s = QUOTE + s + QUOTE;

        return s;
    }

    // Strips the enclosing quotes, if any, and turns doubled quotes back into single ones.
    public static string Unescape( string s )
    {
        if ( s.StartsWith( QUOTE ) && s.EndsWith( QUOTE ) )
        {
            s = s.Substring( 1, s.Length - 2 );

            if ( s.Contains( ESCAPED_QUOTE ) )
                s = s.Replace( ESCAPED_QUOTE, QUOTE );
        }

        return s;
    }

    private const string QUOTE = "\"";
    private const string ESCAPED_QUOTE = "\"\"";

    // '\r' is included because ReadLine also treats a bare carriage return as a line break.
    private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n', '\r' };
}
Apache commons CSV | How can I ignore/include semicolon, comma in a field?
You can configure TAB as the delimiter instead of using the comma of the DEFAULT format:
CSVPrinter printer = new CSVPrinter(writer, CSVFormat.TDF.withHeader(HEADERS));
https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVFormat.html#TDF
How should I escape commas and speech marks in CSV files so they work in Excel?
We eventually found the answer to this.
Excel will only respect the escaping of commas and speech marks if the column value is NOT preceded by a space. So generating the file without spaces like this...
Reference,Title,Description
1,"My little title","My description, which may contain ""speech marks"" and commas."
2,"My other little title","My other description, which may also contain ""speech marks"" and commas."
... fixed the problem. Hope this helps someone!