How to Force One Field in Ruby's CSV Output to Be Wrapped with Double-Quotes

How do I force one field in Ruby's CSV output to be wrapped with double-quotes?

Well, there's a way to do it, but it isn't as clean as I'd hoped the CSV code would allow.

I had to subclass CSV, override the CSV#<< method, and add a forced_quote_fields= method to define which fields to force-quote, plus pull two lambdas from other methods. At least it works for what I want:

require 'csv'

class MyCSV < CSV
  def <<(row)
    # make sure headers have been assigned
    if header_row? and [Array, String].include? @use_headers.class
      parse_headers # won't read data for Array or String
      self << @headers if @write_headers
    end

    # handle CSV::Row objects and Hashes
    row = case row
          when self.class::Row then row.fields
          when Hash then @headers.map { |header| row[header] }
          else row
          end

    @headers = row if header_row?
    @lineno += 1

    @do_quote ||= lambda do |field|
      field = String(field)
      encoded_quote = @quote_char.encode(field.encoding)
      encoded_quote +
        field.gsub(encoded_quote, encoded_quote * 2) +
        encoded_quote
    end

    @quotable_chars ||= encode_str("\r\n", @col_sep, @quote_char)
    @forced_quote_fields ||= []

    @my_quote_lambda ||= lambda do |field, index|
      if field.nil? # represent +nil+ fields as empty unquoted fields
        ""
      else
        field = String(field) # Stringify fields
        # represent empty fields as empty quoted fields
        if (
          field.empty? or
          field.count(@quotable_chars).nonzero? or
          @forced_quote_fields.include?(index)
        )
          @do_quote.call(field)
        else
          field # unquoted field
        end
      end
    end

    output = row.map.with_index(&@my_quote_lambda).join(@col_sep) + @row_sep # quote and separate
    if (
      @io.is_a?(StringIO) and
      output.encoding != raw_encoding and
      (compatible_encoding = Encoding.compatible?(@io.string, output))
    )
      @io = StringIO.new(@io.string.force_encoding(compatible_encoding))
      @io.seek(0, IO::SEEK_END)
    end
    @io << output

    self # for chaining
  end
  alias_method :add_row, :<<
  alias_method :puts, :<<

  def forced_quote_fields=(indexes=[])
    @forced_quote_fields = indexes
  end
end

That's the code. Calling it:

data = [
  %w[1 2 3],
  [ 2, 'two too', 3 ],
  [ 3, 'two, too', 3 ]
]

quote_fields = [1]

puts "Ruby version: #{ RUBY_VERSION }"
puts "Quoting fields: #{ quote_fields.join(', ') }", "\n"

csv = MyCSV.generate do |_csv|
  _csv.forced_quote_fields = quote_fields
  data.each do |d|
    _csv << d
  end
end

puts csv

results in:

# >> Ruby version:   1.9.2
# >> Quoting fields: 1
# >>
# >> 1,"2",3
# >> 2,"two too",3
# >> 3,"two, too",3
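On a newer Ruby, where the CSV internals used above may differ, the same per-field rule can be sketched without subclassing. This is only a minimal sketch, not the CSV library's own method: quote_field and csv_line are hypothetical helper names, and a comma separator with a double-quote quote character is assumed.

```ruby
# Hypothetical helpers; they build a CSV line by hand instead of patching CSV.

# Wrap a string in double quotes, doubling any embedded quote characters.
def quote_field(text)
  '"' + text.gsub('"', '""') + '"'
end

# Join one row into a CSV line. Fields whose index is listed in +forced+
# are always quoted; other fields are quoted only when they need it.
def csv_line(row, forced: [])
  cells = row.each_with_index.map do |field, index|
    if field.nil?
      "" # nil becomes an empty unquoted field
    else
      text = String(field)
      needs_quotes = text.empty? || text.match?(/[",\r\n]/)
      (forced.include?(index) || needs_quotes) ? quote_field(text) : text
    end
  end
  cells.join(",")
end

data = [%w[1 2 3], [2, 'two too', 3], [3, 'two, too', 3]]
data.each { |row| puts csv_line(row, forced: [1]) }
```

With forced: [1] this prints the same three lines as the MyCSV output above. The trade-off is that the sketch ignores encodings and custom separators, which the subclass inherits from CSV.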

Ruby CSV - Trying to wrap output in double quotes, getting """Hello World""" instead of "Hello World"

This is correct. Quote characters are escaped in CSV files by doubling. And all fields that contain commas, newlines and/or quote characters need to be enclosed in quotes.

So the first quote starts a quoted field, the second and third quote encode the actual quote character.

[Figure: Excel screenshot of a spreadsheet row]

becomes

Hello,"Field, with comma","2"" by 4""",123
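The doubling rule can be checked with the standard library's CSV.generate_line and CSV.parse_line. The following is a small sketch; the row values are taken from the line above, and the variable names are only illustrative.

```ruby
require 'csv'

row = ['Hello', 'Field, with comma', '2" by 4"', 123]

# Fields containing a comma or a quote are wrapped in quotes,
# and each embedded quote is doubled.
line = CSV.generate_line(row)
puts line

# Parsing the line reverses the escaping; every field comes back as a String.
fields = CSV.parse_line(line)
p fields

# A field that itself begins and ends with a quote character
# therefore shows up with three quotes on each side.
puts CSV.generate_line(['"Hello World"'])
```

So a value that already includes literal quote characters should be passed to CSV without them if the goal is a single pair of quotes in the file.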

Quote all fields in CSV output

Change

CSV::Writer.generate(@out) do |csv|

to

CSV::Writer.generate(@out, {:force_quotes=>true}) do |csv|
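CSV::Writer belongs to the old CSV library from Ruby 1.8. In the CSV library that ships with Ruby 1.9 and later, force_quotes is an option accepted by CSV.generate and CSV.new. A minimal sketch, assuming that newer library and made-up row values:

```ruby
require 'csv'

# force_quotes: true wraps every field in quotes, whether it needs them or not.
out = CSV.generate(force_quotes: true) do |csv|
  csv << [1, 'two', 3]
  csv << ['a,b', 'plain', 'x']
end

puts out
```

Non-string values such as the integers here are converted to strings before being quoted.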

Writing to csv is adding quotes

If you get rid of the quotes then your output is no longer CSV. The CSV class can be instructed to use a different delimiter and will only quote if that delimiter is included in the input. For example:

require 'csv'
output = "This is a, ruby output"
File.open("output/abc.csv", "a+") do |io|
  csv = CSV.new(io, col_sep: '^')
  csv << [output, "the end"]
end

Output:

This is a, ruby output^the end
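To see that quoting depends on the delimiter actually in use, here is a short sketch with CSV.generate_line; the second row is a made-up value that contains the '^' delimiter.

```ruby
require 'csv'

# The comma is no longer special once col_sep is '^', so nothing is quoted.
plain = CSV.generate_line(['This is a, ruby output', 'the end'], col_sep: '^')

# A field containing the delimiter itself still has to be quoted.
quoted = CSV.generate_line(['contains ^ caret', 'the end'], col_sep: '^')

puts plain, quoted
```

Choosing a delimiter that never occurs in the data avoids the quotes, but a reader of the file must then be told about the non-default separator.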

Ruby CSV.open need to remove quotes and null characters

As stated in the CSV documentation, you have to set quote_char to some character, and this character will always be used to quote empty fields.

It seems the only solution in this case is to remove the quote_char characters from the generated CSV file. You can do it like this:

# read the generated file, strip every NUL quote character, write the result
quoted = File.read("#{file_path}file.tab")
unquoted = quoted.gsub("\0", "")
File.open("#{file_path}unquoted_file.tab", "w") { |file| file.puts unquoted }

I assume here that NULLs are the only quoted fields. If that's not the case, use the default quote_char: '"' and gsub(',"",', ',,'), which should handle almost all cases of fields containing special characters.

But since you note that the results of your query are large, it might be more practical to write the CSV file yourself and avoid processing the output twice. You could simply write:

File.open("#{file_path}unquoted_file.tab", "w") do |file|
  file.puts "Key,channel" # header line
  series_1_results.each_hash do |series_1|
    # write each record as a plain line, with no CSV quoting applied
    file.puts "#{series_1['key']},#{series_1['channel']}"
  end
end

Once more, you might need to handle fields with special characters.


