FasterCSV: several separators
Solution 1:
One simple way to do it is to let the user pick the separator their CSV file uses from a drop-down, and then pass that value as the :col_sep option in the CSV.read() call. But I guess you want it to be automatic. :-)
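A minimal sketch of Solution 1, assuming the drop-down submits one of a few known labels. The SEPARATORS table and the parse_with_choice helper are hypothetical names, not part of the CSV library. CSV.parse is used here so the example runs without a file; CSV.read takes the same col_sep option.

```ruby
require 'csv'

# Hypothetical mapping from drop-down values to the separators they stand for.
SEPARATORS = {
  "comma"     => ",",
  "tab"       => "\t",
  "semicolon" => ";"
}.freeze

# Parses the data with the separator the user picked.
# fetch raises KeyError for a choice that is not in the table.
def parse_with_choice(data, choice)
  CSV.parse(data, col_sep: SEPARATORS.fetch(choice))
end

parse_with_choice("a\tb\tc\n", "tab")
# => [["a", "b", "c"]]
```

Looking the value up in a fixed table, rather than passing the submitted string straight through, keeps unexpected input out of the col_sep option.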
Solution 2:
You can read in just the first line of the CSV file (note that File.read() loads the entire file; something like File.open(filename, &:gets) returns only the first line) and analyze it by matching it against /,/ and then against /\t/. Depending on which RegExp matches, you pick that (single) separator and then read in the file with CSV.read(..., :col_sep => single_separator).
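The detection step could be sketched roughly as follows. The helper names detect_col_sep and read_csv_auto are made up for illustration, and the fallback to a comma when neither pattern matches is an assumption, not something the answer specifies.

```ruby
require 'csv'

# Looks only at the first line of the file and picks a single separator.
# The comma is checked first, then the tab, in the order described above.
def detect_col_sep(path)
  first_line = File.open(path, &:gets).to_s
  case first_line
  when /,/  then ","
  when /\t/ then "\t"
  else ","  # assumption: default to a comma if neither separator is found
  end
end

# Reads the whole file with the one separator that was detected.
def read_csv_auto(path)
  CSV.read(path, col_sep: detect_col_sep(path))
end
```

This is only a heuristic: a tab-separated file whose first line happens to contain a comma inside a field would be detected as comma-separated.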
But Beware:
At first it looks nice and elegant to want to use ",\t" as the separator in the method call to allow both, but please note that this would introduce a potentially nasty bug!
If a CSV file happened to contain both tabs and commas, what would you do then? Separate on both? How can you be sure? I think that would be a mistake, because separators don't appear "mixed" like this in regular CSV files: it's always either ',' or "\t". Note also that col_sep is taken as a literal string, so ",\t" would only match a comma immediately followed by a tab.
So I think you should not use ",\t". It could cause huge problems, and that's probably the reason why the col_sep option was not implemented to accept a RegExp.
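A quick check of what col_sep does with a two-character value, assuming the standard csv library. The sample strings are made up for illustration.

```ruby
require 'csv'

# A comma directly followed by a tab is treated as ONE separator.
adjacent = CSV.parse("a,\tb", col_sep: ",\t")
# => [["a", "b"]]

# Commas and tabs that are not adjacent do not split anything:
# the whole line comes back as a single field.
mixed = CSV.parse("a,b\tc", col_sep: ",\t")
# => [["a,b\tc"]]
```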
Rails: Use more than 1 col_sep
col_sep only accepts one value. You can see examples of how it's used here:
http://rxr.whitequark.org/mri/source/lib/csv.rb
(lines 1654 and 1803 are a couple of examples)
One workaround could be to replace all instances of one separator with the other before parsing, using something like gsub. Not the silver bullet you were hoping for, but depending on your requirements it could do the trick!
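A minimal sketch of that gsub workaround, under the loud assumption that no field legitimately contains a tab or a comma (a quoted field such as "c\td" would be damaged by the substitution). The sample data is made up.

```ruby
require 'csv'

raw = "a\tb,c\n1,2\t3\n"

# Naive normalization: turn every tab into a comma before parsing.
# Assumption: tabs and commas only ever appear as separators.
normalized = raw.gsub("\t", ",")

rows = CSV.parse(normalized)
# => [["a", "b", "c"], ["1", "2", "3"]]
```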
FasterCSV default options and their usage
FasterCSV replaced the former CSV module in the standard library as of Ruby 1.9 and has been called 'CSV' ever since. Have a look at the documentation of the CSV.new method for the available options.
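As a small illustration, the defaults can also be inspected at runtime through the CSV::DEFAULT_OPTIONS hash, and any of them can be overridden per call. This is a sketch against the csv library shipped with current Ruby versions; the variable names are only for illustration.

```ruby
require 'csv'

# A few of the defaults that apply when no options are passed.
default_sep     = CSV::DEFAULT_OPTIONS[:col_sep]    # => ","
default_quote   = CSV::DEFAULT_OPTIONS[:quote_char] # => "\""
default_headers = CSV::DEFAULT_OPTIONS[:headers]    # => false

# Overriding one default for a single call.
rows = CSV.parse("a;b\n1;2\n", col_sep: ";")
# => [["a", "b"], ["1", "2"]]
```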
How do I treat a string (not a file) as a line in a CSV file and parse the string?
If you have arbitrary string data you want to parse as CSV, you can just use parse. No need for a temporary file:
require 'csv'
commas = %Q[a,b,"c,d"]
CSV.parse(commas)
# => [["a", "b", "c,d"]]
tabs = %Q[a\tb\t"c\td"]
CSV.parse(tabs, col_sep: "\t")
# => [["a", "b", "c\td"]]
The col_sep option allows you to specify which separator is used.
How to parse the data from a TXT file with a tab separator?
Here's one way to do it. We go to a lower level, using shift to parse each row, and then silence the MalformedCSVError exception and continue with the next iteration. The problem with this is that the loop doesn't look so nice. If anyone can improve this, you're welcome to edit the code.
FasterCSV.open(filename, :quote_char => '"', :col_sep => "\t", :headers => true) do |csv|
  row = true
  while row
    begin
      row = csv.shift
      break unless row
      # Do things with the row here...
    rescue FasterCSV::MalformedCSVError
      next
    end
  end
end
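One possible alternative, sketched with the current CSV module rather than FasterCSV: parse each line on its own with CSV.parse_line and skip the ones that raise. This is a different technique from the shift loop above, and it assumes that no quoted field spans more than one line and that header handling is not needed. The helper name parse_tsv_leniently is made up.

```ruby
require 'csv'

# Parses tab-separated text line by line, skipping malformed lines.
# Assumption: every record fits on a single physical line.
def parse_tsv_leniently(io)
  rows = []
  io.each_line do |line|
    begin
      row = CSV.parse_line(line, col_sep: "\t")
      rows << row if row
    rescue CSV::MalformedCSVError
      next # skip this line and carry on with the next one
    end
  end
  rows
end
```

Any IO-like object works as the argument, for example File.open(filename) or a StringIO.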