Java: CSV File Easy Read/Write

Rather than reinventing the wheel, have a look at OpenCSV, which supports both reading and writing CSV files; the project documentation has examples of both.

Java: CSV file read & write

I suggest you use one of the existing CSV parsers, such as Commons CSV or Super CSV, instead of reinventing the wheel. It should make your life a lot easier.
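To illustrate why a ready-made parser tends to pay off, here is a minimal sketch using only the standard library. It compares a naive `split(",")` with a small quote-aware splitter on one line that contains a comma and doubled quotes inside quoted fields. The class name, method names, and sample line are made up for this illustration, and the splitter is deliberately incomplete (it does not handle fields that span several lines, for instance), which is exactly the kind of edge case the libraries above already cover.

```java
import java.util.ArrayList;
import java.util.List;

class NaiveSplitDemo {
    // Naive approach: split on every comma, ignoring quotes.
    static String[] naiveSplit(String line) {
        return line.split(",", -1);
    }

    // Minimal quote-aware splitter for ONE line, for illustration only.
    // Handles quoted fields and doubled quotes (""), nothing more.
    static List<String> quoteAwareSplit(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Three logical fields: 1 | Dupont, Marie | said "bonjour"
        String line = "1,\"Dupont, Marie\",\"said \"\"bonjour\"\"\"";
        System.out.println(naiveSplit(line).length);       // 4, one too many
        System.out.println(quoteAwareSplit(line).size());  // 3
    }
}
```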

Read and Write CSV File using Java

Create a class to hold the values for one row, and store instances of it in a list.
Then iterate over the list to write the results.

The map currently in use can hold only two things per entry: the header name and its corresponding value.

map.put(d[0], d[1]);
Here d[0] will be header1 and d[1] will be 4, but we want only the 4.

    class HeaderValues {
        String[] header = new String[3];
    }

    // Shared by the read and write methods below.
    private final List<HeaderValues> list = new ArrayList<>();

    public void readLogFile() throws Exception
    {
        // file(false) and replaceAll(...) are helpers from the question's own code.
        try (BufferedReader reader = new BufferedReader(new FileReader(file(false))))
        {
            String currentLine;
            while ((currentLine = reader.readLine()) != null)
            {
                if (currentLine.contains("2016") && currentLine.contains("helloworld"))
                {
                    String nextBlock = replaceAll(currentLine.substring(22));

                    String[] data = nextBlock.split(";");
                    HeaderValues headerValues = new HeaderValues();
                    // Assuming data.length will always be 3.
                    for (int i = 0, max = data.length; i < max; i++)
                    {
                        String[] d = data[i].split("=");
                        // Assuming split will always have size 2.
                        headerValues.header[i] = d[1];
                    }
                    list.add(headerValues);
                }
            }
        }
    }

    public void writeContentToCsv() throws Exception
    {
        try (FileWriter writer = new FileWriter(".../file_new.csv"))
        {
            for (HeaderValues value : list)
            {
                // One line per record, values separated by semicolons.
                writer.append(value.header[0]).append(";").append(value.header[1]).append(";").append(value.header[2]).append(System.lineSeparator());
            }
        }
    }
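The value-extraction step can be tried in isolation. The following is a small sketch, assuming a block shaped like `header1=4;header2=5;header3=6` (only `header1` and `4` come from the answer above; the other names and values, as well as the class and method names, are made up for illustration). Unlike the code above, it tolerates a pair with no `=` by storing an empty string, which is one possible choice rather than a requirement.

```java
import java.util.ArrayList;
import java.util.List;

class KeyValueLineDemo {
    // Extracts only the values from a block such as "header1=4;header2=5;header3=6".
    static List<String> valuesOf(String block) {
        List<String> values = new ArrayList<>();
        for (String pair : block.split(";")) {
            String[] d = pair.split("=", 2);       // d[0] is the header name, d[1] its value
            values.add(d.length == 2 ? d[1] : ""); // tolerate a pair without '='
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> values = valuesOf("header1=4;header2=5;header3=6");
        System.out.println(String.join(";", values)); // prints 4;5;6
    }
}
```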

Fastest way to read a CSV file java

tl;dr

Reading a 20 MB CSV file, and instantiating an object per row, takes less than 1 second in total elapsed time.

Details

You did not define the term “slow”. So I did an experiment, a casual benchmark test.

First we create a 20 MB file of 40,000 Person records. Each Person holds a first & last name in French, a UUID, and some arbitrary text as a description. The data is written as four columns in a CSV file in UTF-8. I used the Apache Commons CSV library to write and read.

Secondly, this written file is read. Each row of data is read into memory, then used to instantiate and collect a Person object.

Reading this file and instantiating a Person object for each row took less than one second in total elapsed time. Each row takes about 20,000 nanoseconds. This actually includes reading the file twice, as we first scan it to count the rows of data so we can set the initial capacity of the collection. We are also parsing a hex string into the 128-bit value of a UUID, so some of the time is spent on data processing, not just reading.

For Java 16+, define Person class as a record. We override toString to avoid printing out the long description content.

record Person ( String givenName , String surname , UUID id , String description ) 
{
static public String LOREM_IPSUM = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";

@Override
public String toString ()
{
return "Person{ " +
"givenName='" + givenName + '\'' +
" | surname='" + surname + '\'' +
" | id='" + id + '\'' +
" }";
}
}

For earlier Java, write a conventional Person class.

package work.basil.example;

import java.util.UUID;

public class Person
{
// Static
static public String LOREM_IPSUM = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";

// Member variables.
public String givenName, surname, description;
public UUID id;

public Person ( String givenName , String surname , UUID id , String description )
{
this.givenName = givenName;
this.surname = surname;
this.id = id;
this.description = description ;
}

@Override
public String toString ()
{
return "Person{ " +
"givenName='" + givenName + '\'' +
" | surname='" + surname + '\'' +
" | id='" + id + '\'' +
" }";
}
}

And here is the complete app that writes and then reads the 20 MB file. Please study and critique, as I whipped this up in a jiffy. I’ve not double-checked my work.

You will find a write method, and a read method. The main method calls both, and tracks time.

package work.basil.example;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;
import org.apache.commons.csv.CSVRecord;

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

public class CsvSpeed
{
public List < Person > read ( Path path )
{
// TODO: Add a check for valid file existing.

List < Person > list = List.of(); // Default to empty list.
try
{
// Prepare list.
int initialCapacity = ( int ) Files.lines( path ).count();
list = new ArrayList <>( initialCapacity );

// Read CSV file. For each row, instantiate and collect `Person`.
BufferedReader reader = Files.newBufferedReader( path );
Iterable < CSVRecord > records = CSVFormat.RFC4180.withFirstRecordAsHeader().parse( reader );
for ( CSVRecord record : records )
{
String givenName = record.get( "givenName" );
String surname = record.get( "surname" );
UUID id = UUID.fromString( record.get( "id" ) );
String description = record.get( "description" );
// Instantiate `Person` object, and collect it.
Person person = new Person( givenName , surname , id , description );
list.add( person );
}
} catch ( IOException e )
{
e.printStackTrace();
}
return list;
}

public void write ( final Path path )
{
ThreadLocalRandom random = ThreadLocalRandom.current();
try ( final CSVPrinter printer = CSVFormat.RFC4180.withHeader( "givenName" , "surname" , "id" , "description" ).print( path , StandardCharsets.UTF_8 ) ; )
{
int limit = 40_000; // 40_000 yields about 20 MB of data.
List < String > givenNames = List.of( "Adrien" , "Aimon" , "Alerion" , "Alexis" , "Alezan" , "Ancil" , "Andre" , "Antoine" , "Archard" , "Aurélien" , "Averill" , "Baptiste" , "Barnard" , "Bartelemy" , "Bastien" , "Baylee" , "Beale" , "Beau" , "Beaumont" , "Beauregard" , "Bellamy" , "Berger" , "Blaize" , "Blondel" , "Boyce" , "Bruce" , "Brunelle" , "Brys" , "Burcet" , "Burnell" , "Burrell" , "Byron" , "Canaan" , "Carden" , "Carolas" , "Cavell" , "Chace" , "Chanler" , "Chante" , "Chappel" , "Charles" , "Chasen" , "Chason" , "Chemin" , "Chene" , "Cher" , "Chevalier" , "Cheyne" , "Clément" , "Clemence" , "Corbin" , "Coty" , "Cygne" , "Damien" , "Dandre" , "Dariel" , "Darl" , "Dauphine" , "Davet" , "Dax" , "Dean" , "Delice" , "Delmon" , "Destin" , "Dominique" , "Donatien" , "Duke" , "Eliott" , "Elroy" , "Enzo" , "Erwan" , "Etalon" , "Ethan" , "Fabron" , "Ferrand" , "Filberte" , "Florent" , "Florian" , "Fontaine" , "Forest" , "Fortune" , "Franchot" , "Francois" , "Fraser" , "Frayne" , "Gaëtan" , "Gabin" , "Gage" , "Gaige" , "Garland" , "Garner" , "Gaston" , "Gauge" , "Gaylord" , "Germain" , "Germaine" , "German" , "Gervaise" , "Giles" , "Gilles" , "Gitan" , "Grosvener" , "Guifford" , "Guion" , "Guy" , "Guzman" , "Henri" , "Holland" , "Hugo" , "Hugues" , "Hyacinthe" , "Jérémy" , "Jacquan" , "Jacques" , "Jacquez" , "Janvier" , "Jardan" , "Jay" , "Jaye" , "Jehan" , "Jemond" , "Jocquez" , "Jonathan" , "Jules" , "Julien" , "Justus" , "Karoly" , "Lado" , "Lafayette" , "Lamond" , "Lancelin" , "Landis" , "Landry" , "Laron" , "Larrimore" , "Laurent" , "LaValle" , "Leandre" , "Leggett" , "Leonce" , "Leron" , "Leverett" , "Lilian" , "Loïc" , "Lorenzo" , "Louis" , "Lowell" , "Luc" , "Lucien" , "Lukas" , "Macaire" , "Mace" , "Mahieu" , "Maison" , "Malleville" , "Manneville" , "Mantel" , "Marc" , "Marcel" , "Marion" , "Marius" , "Markez" , "Markis" , "Marmion" , "Marquis" , "Marquise" , "Marshall" , "Martial" , "Maslin" , "Mason" , "Matheo" , "Mathias" , "Mathys" , "Matthieu" , 
"Maxence" , "Mayson" , "Mehdi" , "Merle" , "Merville" , "Montague" , "Montaigu" , "Monte" , "Montgomery" , "Montreal" , "Montrel" , "Moore" , "Morel" , "Mortimer" , "Nerville" , "Neuveville" , "Nicolas" , "Noë" , "Noah" , "Noe" , "Norman" , "Norville" , "Nouel" , "Olivier" , "Onfroi" , "Paien" , "Parfait" , "Parnell" , "Pascal" , "Patrice" , "Paul" , "Peppin" , "Percival" , "Percy" , "Pernell" , "Peverell" , "Philipe" , "Pierpont" , "Pierre" , "Pomeroy" , "Prewitt" , "Purvis" , "Quennell" , "Quentin" , "Quincey" , "Quincy" , "Quintin" , "Rémi" , "Rafaelle" , "Ranger" , "Raoul" , "Raphaël" , "Rapier" , "Rawlins" , "Ray" , "Raynard" , "Remi" , "René" , "Renard" , "Rene" , "Reule" , "Reynard" , "Robin" , "Romain" , "Rondel" , "Roy" , "Royal" , "Ruff" , "Rush" , "Russel" , "Rustin" , "Sabastien" , "Sacha" , "Salomon" , "Samuel" , "Satordi" , "Saville" , "Scoville" , "Sebastien" , "Sennett" , "Severin" , "Shant" , "Shantae" , "Sidney" , "Siffre" , "Simeon" , "Simon" , "Sinclair" , "Sofiane" , "Somer" , "Stephane" , "Sully" , "Sydney" , "Sylvain" , "Talbot" , "Talon" , "Telford" , "Tempest" , "Teppo" , "Théo" , "Thayer" , "Thibault" , "Thibaut" , "Thiery" , "Tiennan" , "Tiennot" , "Titouan" , "Toussaint" , "Travaris" , "Tyson" , "Urson" , "Vachel" , "Valentin" , "Valere" , "Vallis" , "Verdun" , "Victoir" , "Victor" , "Waltier" , "William" , "Wyatt" , "Yanis" , "Yann" , "Yves" , "Yvon" , "Zosime" , "Abrial" , "Abrielle" , "Abril" , "Adele" , "Alair" , "Alerion" , "Amee" , "Angelique" , "Annette" , "Antonella" , "Arian" , "Ariane" , "Armandina" , "Aubree" , "Aubrielle" , "Audra" , "Avril" , "Bella" , "Berneta" , "Bette" , "Blaise" , "Blanche" , "Blasa" , "Bonte" , "Brie" , "Brienne" , "Brigit" , "Cachay" , "Calice" , "Camille" , "Camylle" , "Caprice" , "Caressa" , "Caroline" , "Catin" , "Celesta" , "Celeste" , "Cera" , "Cerise" , "Chablis" , "Chalice" , "Chambray" , "Champagne" , "Chandell" , "Chaney" , "Chantal" , "Chante" , "Chanterelle" , "Chantile" , "Chantilly" , 
"Chantrice" , "Charla" , "Charlotte" , "Charmane" , "Chaton" , "Chemin" , "Chenetta" , "Cher" , "Chere" , "Cheri" , "Cheryl" , "Christine" , "Cidney" , "Cinderella" , "Claire" , "Claudette" , "Colette" , "Cordelle" , "Cydnee" , "Daeja" , "Daija" , "Daja" , "Damzel" , "Darelle" , "Darlene" , "Darselle" , "Dejanelle" , "Deleena" , "Delice" , "Demeri" , "Deni" , "Denise" , "Desgracias" , "Desire" , "Desiree" , "Destanee" , "Destiny" , "Dior" , "Domanique" , "Dominique" , "Elaina" , "Elaine" , "Elayna" , "Elise" , "Eloisa" , "Elyse" , "Emeline" , "Emmaline" , "Emmeline" , "Estella" , "Estrella" , "Etiennette" , "Evette" , "Fabienne" , "Fabrienne" , "Fanchon" , "Fancy" , "Fawna" , "Fayana" , "Fayette" , "Fifi" , "Fleur" , "Fleurette" , "Fontanna" , "Fosette" , "Francine" , "Frederique" , "Gabriel" , "Gabriele" , "Gabrielle" , "Gaby" , "Garcelle" , "Gena" , "Genie" , "Georgette" , "Germaine" , "Gervaise" , "Gitana" , "Harriet" , "Heloisa" , "Holland" , "Honnetta" , "Isabelle" , "Ivette" , "Ivonne" , "Jacqueena" , "Jacquetta" , "Jacquiline" , "Jacyline" , "Jaime" , "Jakqueline" , "Janeen" , "Janelly" , "Janina" , "Janiqua" , "Janique" , "Jannnelle" , "Jaquita" , "Jardena" , "Jeanetta" , "Jermaine" , "Jessamine" , "Jewel" , "Jewell" , "Joli" , "Jolie" , "Josephine" , "Jozephine" , "Julieta" , "Karessa" , "Karmaine" , "Klara" , "Laine" , "Lanelle" , "Laramie" , "Layne" , "Layney" , "Leala" , "Leonette" , "Lissette" , "Lizette" , "Lourdes" , "Lucienne" , "Ly" , "Lyla" , "Lysette" , "Madelaine" , "Malerie" , "Manette" , "Marais" , "Marcelle" , "Marché" , "Mardi" , "Margo" , "Marguerite" , "Marie" , "Marie Claude" , "Marie Frances" , "Marie Joelle" , "Marie Pascale" , "Marie Sophie" , "Marjolaine" , "Marquise" , "Marvella" , "Mathieu" , "Matisse" , "Maurelle" , "Maurissa" , "Mavis" , "Melisande" , "Michelle" , "Miette" , "Mignon" , "Mimi" , "Mirya" , "Monet" , "Moniqua" , "Monteen" , "Musetta" , "Myrlie" , "Nadeen" , "Nadia" , "Nadiyah" , "Naeva" , "Nanon" , "Natalle" , 
"Naudia" , "Nettie" , "Nicholas" , "Nicki" , "Nicky" , "Nicole" , "Nicolette" , "Nicolina" , "Nicolle" , "Nikolette" , "Ninette" , "Ninon" , "Noelle" , "Nycole" , "Odelette" , "Opaline" , "Orane" , "Orva" , "Page" , "Parisa" , "Parnel" , "Parris" , "Patrice" , "Peridot" , "Pippi" , "Prairie" , "Rachele" , "Rachelle" , "Racquel" , "Raphaelle" , "Raquelle" , "Remi" , "Renée" , "Renea" , "Renelle" , "Renita" , "Risette" , "Rochelle" , "Romy" , "Rosabel" , "Rosiclara" , "Ruba" , "Russhell" , "Saleena" , "Salina" , "Satin" , "Sedona" , "Serene" , "Shandelle" , "Shanta" , "Shante" , "Shariah" , "Sharita" , "Sharleen" , "Sheree" , "Shereen" , "Sherell" , "Sherice" , "Sherry" , "Sidnee" , "Sidney" , "Sidnie" , "Sidonie" , "Sinclaire" , "Solange" , "Solen" , "Sorrel" , "Suzette" , "Sydnee" , "Sydney" , "Tallis" , "Tempest" , "Toinette" , "Turquoise" , "Veronique" , "Vignette" , "Villette" , "Violeta" , "Virginie" , "Voleta" , "Vonny" );
List < String > surnames = List.of( "Arceneau" , "Aucoin" , "Babin" , "Babineaux" , "Benoit" , "Bergeron" , "Bernard" , "Bertrand" , "Bessette" , "Blanc" , "Blanchard" , "Bonnet" , "Boucher" , "Bourg" , "Bourque" , "Boutin" , "Bouvier" , "Braud" , "Broussard" , "Brun" , "Chevalier" , "David" , "Depaul" , "Desmarais" , "Disney" , "Dubois" , "Dupont" , "Dupuis" , "Durand" , "Fortescue" , "Fournier" , "Garnier" , "Gaudet" , "Gillet" , "Gillette" , "Girard" , "Gravois" , "Grosvenor" , "Lambert" , "Landry" , "Laroche" , "Laurent" , "Lefevre" , "Leroy" , "Leveque" , "Lisle" , "Martin" , "Michel" , "Molyneux" , "Moreau" , "Morel" , "Neville" , "Pelletier" , "Petit" , "Prideux" , "Renard" , "Richard" , "Robert" , "Rousseau" , "Roux" , "Rufus" , "Simon" , "Thomas" );
for ( int i = 1 ; i <= limit ; i++ )
{
String givenName = givenNames.get( random.nextInt( 0 , givenNames.size() ) );
String surname = surnames.get( random.nextInt( 0 , surnames.size() ) );
UUID id = UUID.randomUUID();
String description = Person.LOREM_IPSUM;
printer.printRecord( givenName , surname , id , description );
}
} catch ( IOException e )
{
e.printStackTrace();
}
}

public static void main ( final String[] args )
{
// Launch the app.
CsvSpeed app = new CsvSpeed();

// Write.
String when = Instant.now().truncatedTo( ChronoUnit.SECONDS ).toString().replace( ":" , "•" );
Path pathOutput = Paths.get( "/Users/basilbourque/persons.csv" );
app.write( pathOutput );
System.out.println( "Writing file: " + pathOutput );

// Read.
long start = System.nanoTime();
Path pathInput = Paths.get( "/Users/basilbourque/persons.csv" );
List < Person > list = app.read( pathInput );
long stop = System.nanoTime();

// Time.
long elapsed = ( stop - start );
Duration d = Duration.ofNanos( elapsed );
System.out.println( "Reading elapsed: " + d );
System.out.println( "Reading took nanos per row: " + ( elapsed / list.size() ) );
System.out.println( "nanos elapsed: " + elapsed + " | list.size: " + list.size() );
}
}

When run:

Writing file: /Users/basilbourque/persons.csv

Reading elapsed: PT0.857816234S

Reading took nanos per row: 21445

nanos elapsed: 857816234 | list.size: 40000
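As a quick check on how those numbers relate, here is a small sketch that recomputes the per-row figure from the totals printed above. The class and method names are made up for illustration; the two input values are taken from the run output.

```java
import java.time.Duration;

class ElapsedMath {
    // Integer division, as in the benchmark's own println.
    static long nanosPerRow(long elapsedNanos, int rows) {
        return elapsedNanos / rows;
    }

    public static void main(String[] args) {
        long elapsed = 857_816_234L; // nanoseconds, from the run above
        int rows = 40_000;           // list.size from the run above
        System.out.println(Duration.ofNanos(elapsed));  // PT0.857816234S
        System.out.println(nanosPerRow(elapsed, rows)); // 21445
    }
}
```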

Technology stack:

  • Java 11.0.2 — Zulu by Azul Systems (built from OpenJDK)
  • Run inside IntelliJ 2019.1
  • macOS Mojave
  • MacBook Pro (Retina, 15-inch, Late 2013)
  • Processor: 2.3 GHz Intel Core i7 (4 cores, 8 hyper)
  • 16 GB 1600 MHz DDR3
  • Storage: Solid-state built-in by Apple

Can you recommend a Java library for reading (and possibly writing) CSV files?

We have used OpenCSV (http://opencsv.sourceforge.net/) with good success.

I also came across another question with good links:
Java lib or app to convert CSV to XML file?

CSV file with ID as first item is corrupt in Excel

Basically it's because MS Excel can't decide how to open the file with such content.

When you put ID as the first characters of a spreadsheet-type file, the file matches the signature of a SYLK file, so MS Excel (and potentially other spreadsheet apps) tries to open it as SYLK. But the file does not meet the full SYLK specification, since the rest of its values are comma-separated. Hence the error.

To solve the issue, change "ID" to "id" and it should work as expected.
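If the CSV text is generated in code, a guard along these lines could catch the problem before the file is written. This is only a sketch of the workaround described above: the class and method names are hypothetical, and the check simply tests for a leading uppercase "ID" as the answer describes, without claiming to reproduce Excel's exact detection rules.

```java
class SylkHeaderCheck {
    // True when the text begins with the two uppercase letters "ID",
    // the pattern described above as making Excel treat the file as SYLK.
    static boolean startsLikeSylk(String csvText) {
        return csvText.startsWith("ID");
    }

    // One possible workaround, following the advice above: lowercase the leading "ID".
    static String avoidSylk(String csvText) {
        return startsLikeSylk(csvText) ? "id" + csvText.substring(2) : csvText;
    }

    public static void main(String[] args) {
        String csv = "ID,Name\n1,Prashant Ghimire\n";
        System.out.println(startsLikeSylk(csv));            // true
        System.out.println(avoidSylk(csv).startsWith("id")); // true
    }
}
```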

This is weird. But, yeah!

I also tried to minimize file access by using the File object less.

I tested the code below and it works perfectly.

import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;

public class CsvWriter {
public static void main(String[] args) {

try (PrintWriter writer = new PrintWriter("test.csv")) {

StringBuilder sb = new StringBuilder();
sb.append("id");
sb.append(',');
sb.append("Name");
sb.append('\n');

sb.append("1");
sb.append(',');
sb.append("Prashant Ghimire");
sb.append('\n');

writer.write(sb.toString());

System.out.println("done!");

} catch (FileNotFoundException e) {
System.out.println(e.getMessage());
}

}
}

