CSV File with "Id" as First Item Is Corrupt in Excel

Basically, it's because MS Excel can't decide how to interpret a file that starts this way.

When you put "ID" as the first characters in a spreadsheet-type file, it matches the signature of a SYLK file, so MS Excel (and potentially other spreadsheet apps) tries to open it as a SYLK file. But at the same time, the file does not meet the complete SYLK specification, since the rest of the values in it are comma-separated. Hence, the error is shown.

To solve the issue, change "ID" to "id" and it should work as expected.
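The fix above can be sketched in a few lines of Python (the file name is illustrative); the key point is that Excel only applies the SYLK check to the literal uppercase "ID" at the very start of the file:

```python
import csv

# Write a CSV whose header avoids the SYLK misdetection: a lowercase
# "id" (a quoted "ID" also works) keeps Excel from treating the file
# as SYLK.
with open("report.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "Name"])  # lowercase avoids the SYLK check
    writer.writerow([1, "Prashant Ghimire"])

# Confirm the file no longer starts with the SYLK signature "ID".
with open("report.csv") as f:
    first_line = f.readline()
print(first_line.startswith("ID"))  # False
```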

This is weird. But, yeah!

It also helps to minimize file access by touching the file object less: build the CSV content in a StringBuilder and write it out in a single call. I tested it, and the code below works as expected.

import java.io.FileNotFoundException;
import java.io.PrintWriter;

public class CsvWriter {
    public static void main(String[] args) {

        try (PrintWriter writer = new PrintWriter("test.csv")) {

            // Build the whole CSV in memory, then write it once.
            StringBuilder sb = new StringBuilder();
            sb.append("id");      // lowercase "id" avoids the SYLK misdetection
            sb.append(',');
            sb.append("Name");
            sb.append('\n');

            sb.append("1");
            sb.append(',');
            sb.append("Prashant Ghimire");
            sb.append('\n');

            writer.write(sb.toString());

            System.out.println("done!");

        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}

Problems with creating a CSV file using Excel

It seems that the GDP column contains decimal values from the first column of the .csv file joined to the first digits of the second column. Either something is wrong with the .csv you created, or (more probably) you need to specify the separator in the pandas.read_csv call. Also add header=None to make sure you don't lose the first line of the file (otherwise it would be replaced by colnames).

Try this:

import pandas
colnames = ['GDP', 'Unemployment', 'CPI', 'HousePricing']
data = pandas.read_csv('Dane_2.csv', names = colnames, header=None, sep=';')
GDP = data.GDP.tolist()
print(GDP)
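To see why the separator matters, here is a small sketch using only Python's stdlib csv module (the sample row is made up): in many European-locale exports the fields are semicolon-separated and the comma is the decimal mark, so reading with the default comma delimiter splits each number in half, exactly the symptom described above.

```python
import csv
import io

raw = "1,2;5,0;2,1;3,0\n"  # semicolon-separated, comma as decimal mark

# Default comma delimiter: the decimal commas get treated as separators.
wrong = next(csv.reader(io.StringIO(raw)))
# Correct semicolon delimiter: four clean values.
right = next(csv.reader(io.StringIO(raw), delimiter=";"))

print(wrong)  # ['1', '2;5', '0;2', '1;3', '0']
print(right)  # ['1,2', '5,0', '2,1', '3,0']
```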

Weird characters added to first column name after reading a toad-exported csv file

Try this:

d <- read.csv("test_file.csv", fileEncoding="UTF-8-BOM")

This works in R 3.0.0+ and removes the BOM if present in the file (common for files generated by Microsoft applications: Excel, SQL Server).

R's read.csv prepending 1st column name with junk text

You've got a Unicode UTF-8 BOM at the start of the file:

http://en.wikipedia.org/wiki/Byte_order_mark

A text editor or web browser interpreting the text as ISO-8859-1 or CP1252 will display the characters ï»¿ at the start of the file.

R is giving you the ï and then converting the other two into dots as they are non-alphanumeric characters.
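You can verify this decoding for yourself; the UTF-8 BOM is the byte sequence EF BB BF, and decoding those bytes as CP1252 yields exactly those three junk characters:

```python
# The UTF-8 byte order mark, misread as CP1252, produces the junk
# characters that end up glued to the first column name.
bom = b"\xef\xbb\xbf"
print(bom.decode("cp1252"))  # ï»¿
```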

Here:

http://r.789695.n4.nabble.com/Writing-Unicode-Text-into-Text-File-from-R-in-Windows-td4684693.html

Duncan Murdoch suggests:

You can declare a file to be in encoding "UTF-8-BOM" if you want to
ignore a BOM on input

So try your read.csv with fileEncoding="UTF-8-BOM" or persuade your SQL wotsit to not output a BOM.

Otherwise you may as well test whether the first name starts with ï.. and strip it with substr (as long as you know you'll never have a column that genuinely starts like that...).
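The same BOM-stripping idea carries over to Python, for what it's worth: the "utf-8-sig" encoding strips a leading UTF-8 BOM if one is present and is harmless if it isn't (the file name and contents below are illustrative):

```python
import csv

# Simulate a BOM-prefixed export, as produced by Excel / SQL Server.
with open("test_file.csv", "wb") as f:
    f.write(b"\xef\xbb\xbfGDP,Unemployment\n1.2,3.4\n")

# "utf-8-sig" consumes the BOM, so the first header is clean.
with open("test_file.csv", encoding="utf-8-sig", newline="") as f:
    header = next(csv.reader(f))

print(header[0])  # GDP (no stray characters)
```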
