How do I force one field in Ruby's CSV output to be wrapped with double-quotes?
There is a way to do it, but it isn't as clean as I'd hoped the CSV code would allow.
I had to subclass CSV, override the CSV#<< method, and add another method, forced_quote_fields=, to define which fields to force quoting on. I also had to pull two lambdas in from other methods. At least it works for what I want:
require 'csv'

class MyCSV < CSV
  def <<(row)
    # make sure headers have been assigned
    if header_row? and [Array, String].include? @use_headers.class
      parse_headers # won't read data for Array or String
      self << @headers if @write_headers
    end

    # handle CSV::Row objects and Hashes
    row = case row
          when self.class::Row then row.fields
          when Hash then @headers.map { |header| row[header] }
          else row
          end

    @headers = row if header_row?
    @lineno += 1

    @do_quote ||= lambda do |field|
      field = String(field)
      encoded_quote = @quote_char.encode(field.encoding)
      encoded_quote +
        field.gsub(encoded_quote, encoded_quote * 2) +
        encoded_quote
    end

    @quotable_chars ||= encode_str("\r\n", @col_sep, @quote_char)
    @forced_quote_fields ||= []

    @my_quote_lambda ||= lambda do |field, index|
      if field.nil? # represent +nil+ fields as empty unquoted fields
        ""
      else
        field = String(field) # Stringify fields
        # represent empty fields as empty quoted fields
        if (
          field.empty? or
          field.count(@quotable_chars).nonzero? or
          @forced_quote_fields.include?(index)
        )
          @do_quote.call(field)
        else
          field # unquoted field
        end
      end
    end

    output = row.map.with_index(&@my_quote_lambda).join(@col_sep) + @row_sep # quote and separate
    if (
      @io.is_a?(StringIO) and
      output.encoding != raw_encoding and
      (compatible_encoding = Encoding.compatible?(@io.string, output))
    )
      @io = StringIO.new(@io.string.force_encoding(compatible_encoding))
      @io.seek(0, IO::SEEK_END)
    end
    @io << output

    self # for chaining
  end
  alias_method :add_row, :<<
  alias_method :puts, :<<

  def forced_quote_fields=(indexes=[])
    @forced_quote_fields = indexes
  end
end
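That's the code. It reuses the body of CSV#<< as shipped with Ruby 1.9.2 and relies on that version's instance variables, so it may need adjusting for other Ruby versions. Calling it: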
data = [
  %w[1 2 3],
  [ 2, 'two too', 3 ],
  [ 3, 'two, too', 3 ]
]
quote_fields = [1]
puts "Ruby version: #{ RUBY_VERSION }"
puts "Quoting fields: #{ quote_fields.join(', ') }", "\n"
csv = MyCSV.generate do |_csv|
  _csv.forced_quote_fields = quote_fields
  data.each do |d|
    _csv << d
  end
end
puts csv
results in:
# >> Ruby version: 1.9.2
# >> Quoting fields: 1
# >>
# >> 1,"2",3
# >> 2,"two too",3
# >> 3,"two, too",3
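If depending on the internals of CSV is unappealing, the same quoting rule can be approximated with plain string handling. The following is only a sketch, not part of the CSV library: the method name forced_quote_line is made up for illustration, and it assumes a single-character separator and quote character and a "\n" row separator.

```ruby
require 'csv'

# Hypothetical helper (not a CSV library method): builds one CSV line and
# always quotes the fields whose positions are listed in forced_indexes.
def forced_quote_line(row, forced_indexes, col_sep = ',', quote_char = '"')
  cells = row.each_with_index.map do |field, index|
    next '' if field.nil? # nil stays an empty, unquoted field

    text = String(field)
    special = [col_sep, quote_char, "\r", "\n"].any? { |s| text.include?(s) }
    if text.empty? || special || forced_indexes.include?(index)
      # wrap the field in quotes and double any embedded quote characters
      quote_char + text.gsub(quote_char, quote_char * 2) + quote_char
    else
      text
    end
  end
  cells.join(col_sep) + "\n"
end

data = [
  %w[1 2 3],
  [2, 'two too', 3],
  [3, 'two, too', 3]
]
lines = data.map { |row| forced_quote_line(row, [1]) }
puts lines
# 1,"2",3
# 2,"two too",3
# 3,"two, too",3

# The stock parser reads the forced quotes back as ordinary fields.
parsed_last = CSV.parse_line(lines.last)
```

Because the forced quotes are ordinary CSV quoting, any standard parser should read the lines back unchanged.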
Ruby CSV - Trying to wrap output in double quotes, getting """Hello World""" instead of "Hello World"
This is correct. Quote characters are escaped in CSV files by doubling them, and every field that contains a comma, a newline or a quote character must be enclosed in quotes.
So the first quote starts a quoted field, and the second and third quotes encode the actual quote character.
For example, a row with the four fields Hello / Field, with comma / 2" by 4" / 123
becomes
Hello,"Field, with comma","2"" by 4""",123
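The doubling rule is easy to check with the standard library itself. This short sketch uses CSV.generate_line and CSV.parse_line; the field values are taken from the line above, and the variable names are just for illustration.

```ruby
require 'csv'

fields = ['Hello', 'Field, with comma', '2" by 4"', '123']

# Writing: embedded quotes are doubled and the field is wrapped in quotes.
line = CSV.generate_line(fields)
puts line
# Hello,"Field, with comma","2"" by 4""",123

# Reading: the parser undoes the doubling and returns the original values.
parsed = CSV.parse_line(line)
```

So a value that already contains literal quote characters will show up with three quotes in a row at the start or end of the field, which is valid CSV rather than a bug.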
Quote all fields in CSV output
Change
CSV::Writer.generate(@out) do |csv|
to
CSV::Writer.generate(@out, {:force_quotes=>true}) do |csv|
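On Ruby 1.9 and later, where the CSV::Writer class no longer exists, the same effect should be available through the force_quotes option of CSV.generate (or CSV.new and CSV.open). A minimal sketch, with made-up sample values:

```ruby
require 'csv'

# force_quotes: true wraps every field in quotes, whether it needs them or not.
quoted = CSV.generate(force_quotes: true) do |csv|
  csv << ['Hello', 'World', 123]
end
puts quoted
# "Hello","World","123"
```

Note that this quotes all fields in every row; it does not allow picking individual columns.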
Writing to csv is adding quotes
If you get rid of the quotes then your output is no longer CSV. The CSV class can be instructed to use a different delimiter and will only quote if that delimiter is included in the input. For example:
require 'csv'
output = "This is a, ruby output"
File.open("output/abc.csv", "a+") do |io|
  csv = CSV.new(io, col_sep: '^')
  csv << [output, "the end"]
end
Output:
This is a, ruby output^the end
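To see that quoting depends on the chosen delimiter, the following sketch builds two lines in memory with CSV.generate_line instead of writing to a file. The second sample value is made up to show what happens when the field does contain the delimiter.

```ruby
require 'csv'

# The comma is harmless when '^' is the separator, so nothing is quoted.
plain = CSV.generate_line(['This is a, ruby output', 'the end'], col_sep: '^')

# A field that contains the separator itself still has to be quoted.
quoted = CSV.generate_line(['contains ^ caret', 'the end'], col_sep: '^')

puts plain
puts quoted
# This is a, ruby output^the end
# "contains ^ caret"^the end
```

Choosing a separator that never occurs in the data is therefore what keeps the output free of quotes.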
Ruby CSV.open need to remove quotes and null characters
As stated in the CSV documentation, you have to set quote_char to some character, and this character will always be used to quote empty fields.
It seems the only solution in this case is to remove the quote_char characters from the generated CSV file afterwards. You can do it like this:
quotedFile = File.read("#{file_path}file.tab")
unquotedFile = quotedFile.gsub("\0", "") # strip the NUL quote characters
File.open("#{file_path}unquoted_file.tab", "w") { |file| file.puts unquotedFile }
I assume here that NULs are the only quoted fields. If that's not the case, use the default quote_char: '"' and gsub(',"",', ',,'), which keeps the separators in place and should handle almost all possible cases of fields containing special characters.
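Both clean-up variants can be tried on in-memory strings before touching real files. The sample contents below are invented for illustration; the second replacement keeps both commas so the columns stay aligned.

```ruby
# Variant 1: NUL was used as quote_char, so empty fields appear as two NULs.
nul_quoted = "Key\tchannel\n1\t\0\0\n"
stripped = nul_quoted.gsub("\0", "")

# Variant 2: the default quote_char was used, so empty fields appear as "".
default_quoted = "Key,channel,note\n1,\"\",x\n"
cleaned = default_quoted.gsub(',"",', ',,')

puts stripped
puts cleaned
```

Neither variant touches quoted empty fields at the very start or end of a line in the second format, so real data may need additional patterns.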
But since you note that the results of your query are large, it might be more practical to write the file yourself and avoid processing the output twice. You could simply write:
File.open("#{file_path}unquoted_file.tab", "w") do |file|
  file.puts "Key,channel"
  series_1_results.each_hash do |series_1|
    file.puts "#{series_1['key']},#{series_1['channel']}"
  end
end
Once more, you might need to handle fields with special characters.
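The manual approach can be tried without a database. In this sketch an array of hashes stands in for the query results and a StringIO stands in for the output file; both, along with the method name write_plain, are assumptions made for illustration. It shows that nil values come out as empty, unquoted fields.

```ruby
require 'stringio'

# Hypothetical stand-in for the rows returned by the query.
rows = [
  { 'key' => 'a1', 'channel' => 'news' },
  { 'key' => 'b2', 'channel' => nil }
]

# Writes a header and one comma-joined line per row, with no quoting at all.
def write_plain(io, rows)
  io.puts "Key,channel"
  rows.each do |row|
    io.puts "#{row['key']},#{row['channel']}"
  end
  io
end

result = write_plain(StringIO.new, rows).string
puts result
# Key,channel
# a1,news
# b2,
```

Since nothing is escaped here, a value containing a comma or a newline would silently break the row structure.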