Ruby CSV - Illegal quoting in line 1. CSV::MalformedCSVError
I didn't find a way to read directly from a remote file that contains a BOM, so I use Tempfile to create a temporary file and then call CSV.open with 'r:bom|utf-8':
require 'csv'
require 'open-uri'
require 'tempfile'

doc = Document.find(doc_id)
path = "#{Rails.root.join('tmp')}/#{doc.name.split('.').first}_#{Time.now.to_i}.csv"
# Download the remote file into a local Tempfile so it can be reopened with BOM handling.
file = Tempfile.new(["#{doc.name.split('.').first}_#{Time.now.to_i}", '.csv'])
file.binmode
file << URI.open(doc.file.url).read # Kernel#open no longer opens URLs on Ruby 3.0+
file.close
CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
  # 'r:bom|utf-8' strips a leading UTF-8 BOM while reading.
  CSV.open(file.path, 'r:bom|utf-8', headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n").each_with_index do |line, index|
    # do something
  end
end
Now, it seems to parse the file.
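One possible alternative, not taken from the answer above, is to handle the BOM in memory instead of going through a Tempfile. The sketch below is only an assumption-laden illustration: the `body` string is hypothetical sample data standing in for whatever `URI.open(doc.file.url).read` would return, and it assumes the content is UTF-8.

```ruby
require "csv"

# Hypothetical stand-in for the downloaded body: UTF-8 BOM + ';'-separated rows.
body = "\xEF\xBB\xBF".b + "name;qty\r\nbolt;4\r\n".b

# Treat the bytes as UTF-8 and drop a leading BOM (U+FEFF) if one is present.
text = body.dup.force_encoding(Encoding::UTF_8).delete_prefix("\uFEFF")

# Parse the cleaned string with the same separators as in the answer above.
rows = CSV.parse(text, headers: :first_row, col_sep: ";", row_sep: "\r\n").map(&:to_h)
p rows
```

This avoids the temporary file, but it loads the whole download into memory, so the Tempfile approach may still be preferable for large files.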
Illegal quoting in line 1 using Ruby CSV
The raw bytes of my file start like this:
"\xFF\xFES\x00t\x00a\x00t\x00u\x00s\x00...
0xFF 0xFE is the byte order mark for UTF-16LE.
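To check which BOM a file starts with, its leading bytes can be compared against the known marks. This is a minimal sketch; `BOMS` and `detect_bom` are hypothetical names introduced here for illustration and are not part of Ruby or the CSV library.

```ruby
# Known byte order marks. The 4-byte marks come first so that
# UTF-32LE (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
BOMS = {
  "\x00\x00\xFE\xFF".b => "UTF-32BE",
  "\xFF\xFE\x00\x00".b => "UTF-32LE",
  "\xEF\xBB\xBF".b     => "UTF-8",
  "\xFE\xFF".b         => "UTF-16BE",
  "\xFF\xFE".b         => "UTF-16LE"
}.freeze

# Hypothetical helper: returns the encoding name implied by the BOM, or nil.
def detect_bom(bytes)
  head = bytes.b # compare as raw bytes, regardless of the string's encoding
  BOMS.each { |bom, name| return name if head.start_with?(bom) }
  nil
end

# The first bytes quoted in the question:
puts detect_bom("\xFF\xFES\x00t\x00a\x00t\x00u\x00s\x00".b)
```

In practice the first few bytes could be obtained with something like `File.binread(path, 4)`.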
You have to specify the encoding when processing this file with CSV.foreach:
This method also understands an additional :encoding parameter that you can use to specify the Encoding of the data in the file to be read. You must provide this unless your data is in Encoding::default_external(). CSV will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: "UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.
Furthermore, you have to specify that a BOM is present. According to the IO.new docs:
If “BOM|UTF-8”, “BOM|UTF-16LE” or “BOM|UTF16-BE” are (...) present, the BOM is stripped
Applied to your file and example:
CSV.foreach(file, col_sep: "\t", encoding: "BOM|UTF-16LE:UTF-8", headers: true) do |row|
  # ...
end
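The same encoding specification can be tried out without the original file. The sketch below is an illustration under stated assumptions: the sample data is made up, and instead of passing `encoding:` to CSV.foreach it puts the same "BOM|UTF-16LE:UTF-8" specification into the File.open mode string and hands the decoded text to CSV.parse.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data standing in for the file from the question:
# tab-separated, UTF-16LE, preceded by the FF FE byte order mark.
sample = "Status\tCount\r\nopen\t3\r\n"
bytes  = "\xFF\xFE".b + sample.encode("UTF-16LE").b

rows = []
Tempfile.create(["utf16_sample", ".csv"]) do |tmp|
  tmp.binmode
  tmp.write(bytes)
  tmp.close

  # "BOM|UTF-16LE:UTF-8" strips the BOM, decodes UTF-16LE and transcodes to UTF-8.
  text = File.open(tmp.path, "rb:BOM|UTF-16LE:UTF-8", &:read)
  rows = CSV.parse(text, col_sep: "\t", headers: true).map(&:to_h)
end

p rows
```

Because the BOM is stripped before parsing, the first header comes out as a plain "Status" rather than a string with an invisible leading character.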
CSV.read Illegal quoting in line x
I had this problem with a line like 123,456,a"b"c
The problem is that the CSV parser expects " characters, if they appear, to entirely surround the comma-delimited field.
Solution: use a quote character other than ", one that I was sure would not appear in my data:
CSV.read(filename, :quote_char => "|")
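The effect can be reproduced on the sample line with CSV.parse_line, as a quick sketch. The `liberal_parsing: true` option shown last is an alternative that the answer above does not mention; it is included here only for comparison.

```ruby
require "csv"

line = '123,456,a"b"c'

# With the default quote character, a quote in the middle of an
# unquoted field is rejected.
error = begin
  CSV.parse_line(line)
  nil
rescue CSV::MalformedCSVError => e
  e
end

# Approach from the answer: pick a quote character that is absent from the data.
with_pipe = CSV.parse_line(line, quote_char: "|")

# Alternative (not from the answer): let the parser tolerate stray quotes.
liberal = CSV.parse_line(line, liberal_parsing: true)

p error.class, with_pipe, liberal
```

Changing quote_char means that genuinely quoted fields in the file are no longer unquoted, so it only suits data where " never serves as a real quote.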
Rescue CSV::MalformedCsvError: Illegal quoting in line n
Your solution works. The expected result resides in the variable my_array.
CSV::MalformedCSVError: Illegal quoting in line 1 with SmarterCSV
This is caused by unexpected Unicode characters in your file, such as a byte order mark.
You can process a file that contains them with:
f = File.open(file_path, "r:bom|utf-8")
data = SmarterCSV.process(f)
f.close
Here, data will contain the parsed rows.
Also refer to the official documentation on this: https://github.com/tilo/smarter_csv#notes-about-file-encodings
Illegal Quoting error with Ruby CSV parsing
The Illegal quoting error was caused by a byte order mark (BOM) at the very beginning of the file. It didn't show up in editors, but the Ruby CSV library choked on it unless :encoding => 'bom|utf-8' was set.
Once that was fixed, I still needed to remove all the '^M' characters by running %s/\r//g in vim. Everything worked fine after that.
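Both steps can also be done in Ruby rather than in an editor. This is a rough sketch with made-up sample data: it reads the file with 'r:bom|utf-8' to drop the BOM, deletes the carriage returns (the '^M' characters) and then parses the result.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample: UTF-8 BOM, quoted fields, CRLF line endings.
bytes = "\xEF\xBB\xBF".b + %("id","name"\r\n"1","Ann"\r\n).b

raw_has_bom = nil
text = nil
Tempfile.create(["bom_sample", ".csv"]) do |tmp|
  tmp.binmode
  tmp.write(bytes)
  tmp.close

  # The raw bytes on disk still begin with the BOM.
  raw_has_bom = File.binread(tmp.path).start_with?("\xEF\xBB\xBF".b)

  # 'r:bom|utf-8' skips the BOM while reading.
  text = File.open(tmp.path, "r:bom|utf-8", &:read)
end

# Ruby equivalent of %s/\r//g: remove every carriage return.
cleaned = text.delete("\r")
rows = CSV.parse(cleaned, headers: true).map(&:to_h)
p rows
```

Deleting every "\r" also removes carriage returns inside quoted fields, which matches what the vim command does but may not be wanted for all data.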