Parse CSV File with Header Fields as Attributes for Each Row

Parse CSV file with header fields as attributes for each row

Using Ruby 1.9 and above, you can get an indexable object:

require 'csv'

CSV.foreach('my_file.csv', :headers => true) do |row|
  puts row['foo'] # prints 1 the 1st time, "blah" the 2nd time, etc.
  puts row['bar'] # prints 2 the 1st time, 7 the 2nd time, etc.
end

It's not dot syntax but it is much nicer to work with than numeric indexes.
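If you do want dot syntax, one option is to copy each row into a Struct built from the headers. This is only a minimal sketch: the sample data is a made-up stand-in for my_file.csv, `record_class` and `records` are names I picked for illustration, and it assumes every header is a valid Ruby method name once converted to a symbol.

```ruby
require 'csv'

# Hypothetical sample data standing in for my_file.csv.
data = "foo,bar\n1,2\nblah,7\n"

# header_converters: :symbol turns the header "foo" into :foo, and so on.
table = CSV.parse(data, headers: true, header_converters: :symbol)

# Build a Struct class whose members are the header names.
record_class = Struct.new(*table.headers)

# Copy each row's values into a Struct instance, in header order.
records = table.map { |row| record_class.new(*row.fields) }

records.each { |r| puts "#{r.foo} #{r.bar}" }
```

Values stay strings unless you also pass a converter such as `converters: :numeric`.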

As an aside, for Ruby 1.8.x FasterCSV is what you need to use the above syntax.

Parse CSV into multiple lines where each value is printed after its header


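require 'csv'

filename = 'my_file.csv' # placeholder path; point this at your CSV file
headers = nil            # declared outside the block so it survives between iterations
line_n = 0

CSV.read(filename).each do |arr|
  if line_n == 0
    headers = arr
  else
    puts "line #{line_n}"

    headers.zip(arr).each do |a|
      puts "#{a.first} : #{a.last}"
    end
  end
  line_n += 1
end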

This produces:

line 1
key1 : a
key2 : b
key3 : c

line 2
key1 : d
key2 :
key3 : f
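For comparison, here is a sketch of the same listing that lets the CSV library track the headers through the headers: true option. The inline `data` string is an assumed stand-in for the file that produced the output above, and the `lines` array exists only so the result is easy to inspect.

```ruby
require 'csv'

# Assumed contents of the input file, inferred from the output shown above.
data = "key1,key2,key3\na,b,c\nd,,f\n"

lines = []
CSV.parse(data, headers: true).each_with_index do |row, index|
  lines << "line #{index + 1}"
  # CSV::Row#each yields each header together with its value.
  row.each { |header, value| lines << "#{header} : #{value}" }
end

puts lines
```

An empty field comes back as nil, which interpolates as an empty string, so the key2 line of the second row has nothing after the colon.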

Parse CSV file with headers when the headers are part way down the page

Let's first create the csv file that would be produced from the spreadsheet.

csv =<<-_
N211E,C172,2004,Cessna,172R,airplane,airplane
C-GPGT,C172,1976,Cessna,172M,airplane,airplane
N17AV,P28A,1983,Piper,PA-28-181,airplane,airplane
N4508X,P28A,1975,Piper,PA-28-181,airplane,airplane
,,,,,,
Flights Table,,,,,,

Date,AircraftID,From,To,Route,TimeOut,TimeIn
2017-07-27,N17AV,KHPN,KHPN,KHPN KHPN,17:26,18:08
2017-07-27,N17AV,KHSE,KFFA,,16:29,17:25
2017-07-27,N17AV,W41,KHPN,,21:45,23:53
_

FName = 'test.csv'
File.write(FName, csv)
#=> 395

We only want the part of the string that begins "Date,". The easiest option is probably to first extract the relevant text. If the file is not humongous, we can slurp it into a string and then remove the unwanted bit.

str = File.read(FName).gsub(/\A.+?(?=^Date,)/m, '')
#=> "Date,AircraftID,From,To,Route,TimeOut,TimeIn\n2017-07-27,N17AV,
# KHPN,KHPN,KHPN KHPN,17:26,18:08\n2017-07-27,N17AV,KHSE,KFFA,,16:29,
# 17:25\n2017-07-27,N17AV,W41,KHPN,,21:45,23:53\n"

The regular expression that is gsub's first argument could be written in free-spacing mode, which makes it self-documenting:

/
  \A         # match the beginning of the string
  .+?        # match one or more characters, lazily
  (?=^Date,) # match "Date," at the beginning of a line in a positive lookahead
/mx          # multi-line and free-spacing regex definition modes
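To see the free-spacing form in action, here is a small sketch run against a made-up string (`text` is not the real file, just a few lines with the same layout). It uses sub rather than gsub, since the pattern is anchored with \A and can match at most once.

```ruby
# Hypothetical miniature of the file: junk lines, a blank line, then the table.
text = "junk,1\nFlights Table,\n\nDate,From\n2017-07-27,KHPN\n"

pattern = /
  \A          # beginning of the string
  .+?         # one or more characters, lazily (newlines included in m mode)
  (?=^Date,)  # stop just before a line that begins with "Date,"
/mx

trimmed = text.sub(pattern, '')
puts trimmed
```

Avoid putting a forward slash inside the comments, because it would end the regex literal early.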

Now that we have the part of the file we want in the string str, we can use CSV::parse to create the CSV::Table object:

csv_tbl = CSV.parse(str, headers: true)
#=> #<CSV::Table mode:col_or_row row_count:4>

The option :headers => true is documented in CSV::new.

Here are a couple of examples of how csv_tbl can be used.

csv_tbl.each { |row| p row }
#=> #<CSV::Row "Date":"2017-07-27" "AircraftID":"N17AV" "From":"KHPN"\
# "To":"KHPN" "Route":"KHPN KHPN" "TimeOut":"17:26" "TimeIn":"18:08">
# #<CSV::Row "Date":"2017-07-27" "AircraftID":"N17AV" "From":"KHSE"\
# "To":"KFFA" "Route":nil "TimeOut":"16:29" "TimeIn":"17:25">
# #<CSV::Row "Date":"2017-07-27" "AircraftID":"N17AV" "From":"W41"\
# "To":"KHPN" "Route":nil "TimeOut":"21:45" "TimeIn":"23:53">

(I've used the character '\' to signify that the string continues on the following line, so that readers would not have to scroll horizontally to read the lines.)

csv_tbl.each { |row| p row["From"] }
# "KHPN"
# "KHSE"
# "W41"
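A few more ways to query the table, sketched under the assumption that str holds the text extracted earlier (it is rebuilt inline here so the snippet runs on its own). The variable names on the left are mine.

```ruby
require 'csv'

# Same text as the extracted string str, rebuilt so this snippet is self-contained.
str = "Date,AircraftID,From,To,Route,TimeOut,TimeIn\n" \
      "2017-07-27,N17AV,KHPN,KHPN,KHPN KHPN,17:26,18:08\n" \
      "2017-07-27,N17AV,KHSE,KFFA,,16:29,17:25\n" \
      "2017-07-27,N17AV,W41,KHPN,,21:45,23:53\n"
csv_tbl = CSV.parse(str, headers: true)

header_names  = csv_tbl.headers     # all header strings, in file order
from_column   = csv_tbl['From']     # a String index returns the whole column
first_flight  = csv_tbl[0].to_h     # an Integer index returns a row; to_h makes a Hash
missing_route = csv_tbl.select { |row| row['Route'].nil? } # rows with an empty Route

p from_column
p first_flight
puts missing_route.size
```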

Readers who want to know more about how Ruby's CSV class is used may wish to read Darko Gjorgjievski's piece, "A Guide to the Ruby CSV Library, Part 1 and Part 2".
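If the file were too big to slurp into a string, a rough alternative is to skip lines until the header row turns up and then hand the open file to CSV. This is a sketch under a few assumptions: the header row is the first line that starts with "Date,", none of the skipped lines contain quoted fields spanning several lines, and the temporary file and the `from_airports` array exist only to make the example self-contained.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for test.csv: unwanted rows first, then the real table.
content = <<~CSV_TEXT
  N211E,C172,2004,Cessna,172R,airplane,airplane
  ,,,,,,
  Flights Table,,,,,,

  Date,AircraftID,From,To,Route,TimeOut,TimeIn
  2017-07-27,N17AV,KHPN,KHPN,KHPN KHPN,17:26,18:08
  2017-07-27,N17AV,KHSE,KFFA,,16:29,17:25
CSV_TEXT

from_airports = []

Tempfile.create(['flights', '.csv']) do |tmp|
  tmp.write(content)
  tmp.flush

  File.open(tmp.path) do |f|
    # Read line by line until the header row; the file position ends up just past it.
    header_line = f.each_line.find { |line| line.start_with?('Date,') }
    raise 'header row not found' unless header_line

    headers = CSV.parse_line(header_line)

    # Passing an Array as :headers makes CSV treat every remaining line as data.
    CSV.new(f, headers: headers).each do |row|
      from_airports << row['From']
    end
  end
end

p from_airports
```

Only one row is held in memory at a time, at the cost of not getting a CSV::Table back.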

Parse CSV File, Take Input From Columns, Output to A New Column and Export to a new CSV

Without seeing a snippet of your CSV file, I can't give you an exact answer. I can, however, use a CSV file I made that should look something like your data. You'll have to fill in the gaps wherever they are.

I've never dealt with parsing CSV files in Ruby before, so I'll step through my problem solving process and hopefully it illustrates how you can solve this type of problem in the future.

First off, here is the CSV file I'm using:

Title,Day,Date,Time
Fun in the Sun,Wed,09/11/14,3:00 pm

A quick Google search of Ruby CSV yields the documentation for Ruby's CSV parsing class: http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html.

Further Googling for Ruby csv headers yields this Stack Overflow post: Parse CSV file with header fields as attributes for each row

Great, now we can do this:

CSV.foreach('my_file.csv', :headers => true) do |csv_obj|
  # Access CSV data here
end

Looking at the documentation for the CSV class, we can also write data to a CSV. Finally, we can solve your problem:

require 'csv'

# convert_time is a helper that is not shown in this answer; it must return
# the time portion of the timestamp as a string.
def add_date(date, time)
  # 09/11/14 -> 20140911
  date_array = date.split("/")
  # Assumes two-digit years in the 2000s.
  new_date = "20" + date_array[2] + date_array[0] + date_array[1]
  "TZID=AMERICA/LOS ANGELES:#{new_date}" + "t" + convert_time(time)
end

CSV.open('output_file.csv', 'wb') do |csv_out|
  CSV.foreach('input_file.csv', :headers => true) do |csv_in|
    csv_out << [csv_in['Title'], csv_in['Day'], add_date(csv_in['Date'], csv_in['Time'])]
  end
end
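The code above calls convert_time without defining it. What follows is a purely hypothetical version, not the one the answer had in mind: it assumes the input looks like "3:00 pm" and that the wanted output is a 24-hour HHMMSS string. Adjust both to whatever your target format needs.

```ruby
require 'time'

# Hypothetical helper: the answer never shows its own convert_time.
# Assumes 12-hour input such as "3:00 pm" and returns 24-hour "HHMMSS".
def convert_time(time)
  Time.strptime(time.strip, '%I:%M %p').strftime('%H%M%S')
end

puts convert_time('3:00 pm')
```

Time.strptime raises ArgumentError on input that does not match the format, which is probably what you want for malformed rows.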

Make sense?

Parsing CSV files in C#, with header

Let a library handle all the nitty-gritty details for you! :-)

Check out FileHelpers and stay DRY - Don't Repeat Yourself - no need to re-invent the wheel a gazillionth time....

You basically just need to define the shape of your data - the fields in your individual line in the CSV - by means of a public class (and some well-thought-out attributes like default values, replacements for NULL values and so forth), point the FileHelpers engine at a file, and bingo - you get back all the entries from that file. One simple operation - great performance!

Reading column names alone in a csv file

You can read the header by using the next() function, which returns the next row of the reader's iterable object as a list. Then you can add the rest of the file's content to a list.

import csv
with open("C:/path/to/file.csv", "rb") as f:
    reader = csv.reader(f)
    i = reader.next()
    rest = list(reader)

Now i holds the column names as a list.

print i
>>>['id', 'name', 'age', 'sex']

Also note that reader.next() does not work in Python 3. Instead, use the built-in next() to get the first line of the CSV immediately after creating the reader, and open the file in text mode with newline="" rather than "rb", like so:

import csv
with open("C:/path/to/file.csv", newline="") as f:
    reader = csv.reader(f)
    i = next(reader)

print(i)
>>>['id', 'name', 'age', 'sex']

How to get headers from a CSV.read on a CSV file with only a header row in ruby

The behavior you are experiencing is expected. A somewhat similar question two years ago had an answer that pointed out the same issue you're having. That person opened a bug report for Ruby where the Ruby devs responded and rejected it. And according to some people that is technically not a well-formed CSV.

However, I agree with you and the person who opened the bug. The headers: true option should fill out the CSV.headers regardless of whether there is actually data on the following lines or not. The current behavior seems baffling and will only lead to bugs in code.

As a quick fix for your issue I would simply pass return_headers: true and begrudgingly skip over the first entry in the result, which will always be the header row.
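A minimal sketch of that workaround, using an inline string in place of your file (the column names are invented). Rather than dropping the first entry by position, it filters on CSV::Row#header_row?, which amounts to the same thing here.

```ruby
require 'csv'

# Hypothetical file contents: a header row and nothing else.
header_only = "id,name,age\n"

# return_headers: true makes the header row come back as the first row.
table = CSV.parse(header_only, headers: true, return_headers: true)

headers   = table.headers
data_rows = table.reject(&:header_row?) # everything except the header row

p headers
p data_rows
```

The same options can be passed to CSV.read when reading from a path.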


