How to Read the Content of an Excel Spreadsheet Using Ruby

How do I read the content of an Excel spreadsheet using Ruby?

It looks like row, whose class is Spreadsheet::Excel::Row, is effectively an Excel range, and that it either includes Enumerable or at least exposes some enumerable behaviour, such as #each.

So you might rewrite your script something like this:

require 'spreadsheet'
book = Spreadsheet.open('myexcel.xls')
sheet1 = book.worksheet('Sheet1') # can use an index or worksheet name
sheet1.each do |row|
  break if row[0].nil? # if first cell empty
  puts row.join(',') # looks like it calls "to_s" on each cell's Value
end

Note that I've parenthesised arguments, which is generally advisable these days, and removed the semicolons, which are not necessary unless you're writing multiple statements on a line (which you should rarely, if ever, do).

It's probably a hangover from a larger script, but I'll point out that in the code given the book and sheet1 variables aren't really needed, and that Spreadsheet.open takes a block, so a more idiomatic Ruby version might be something like this:

require 'spreadsheet'
Spreadsheet.open('MyTestSheet.xls') do |book|
  book.worksheet('Sheet1').each do |row|
    break if row[0].nil?
    puts row.join(',')
  end
end
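
The loop-and-break pattern does not depend on the spreadsheet gem itself. As a minimal sketch, assuming each row behaves like an array of cell values (the rows_until_blank helper and the sample data are hypothetical, not part of the gem), the same logic can be tried on plain arrays:

```ruby
# Hypothetical helper: collects rows as comma-joined strings until the
# first row whose first cell is nil, mirroring `break if row[0].nil?`.
def rows_until_blank(rows)
  lines = []
  rows.each do |row|
    break if row[0].nil?   # stop at the first row with an empty first cell
    lines << row.join(',') # join calls to_s on each cell
  end
  lines
end

# Plain arrays standing in for worksheet rows (made-up data).
sample = [['id', 'name'], [1, 'Ada'], [nil, 'ignored'], [2, 'never reached']]
puts rows_until_blank(sample)
```

Note that everything after the first blank row is dropped, which may or may not be what you want if the sheet has gaps.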

Extract Data from Excel Spreadsheet into Database in Ruby

Your table has ukprn and name as its columns, so the find_or_create_by call should look like this:

Institute.find_or_create_by(ukprn: ukprn , name: name)

Now you just need to initialize ukprn and name from row.

require 'roo'

xlsx = Roo::Excelx.new(File.expand_path('../Downloads/UKPRN.xlsx'))

xlsx.each_row_streaming(offset: 1) do |row|
  Institute.find_or_create_by(ukprn: row[0].value, name: row[1].value)
end

To execute this code, do one of the following:

  • put it in db/seeds.rb and execute rake db:seed
  • put it in script.rb and run rails runner script.rb
  • copy-paste it in console (not really recommended)
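
Before touching the database, it can help to check the row-to-attributes mapping on its own. The following is only a sketch under assumptions: institute_attributes is a hypothetical helper, the rows are plain arrays of already-extracted cell values, and the sample data is made up.

```ruby
# Hypothetical helper: turns raw rows into attribute hashes suitable for
# find_or_create_by, skipping the header row (like offset: 1 above).
def institute_attributes(rows)
  rows.drop(1).filter_map do |ukprn, name|
    next if ukprn.nil? || name.nil? # skip incomplete rows
    { ukprn: ukprn, name: name.to_s.strip }
  end
end

# Made-up rows standing in for the spreadsheet content.
rows = [
  ['UKPRN', 'NAME'],
  [10_000_001, ' Example College '],
  [nil, 'row without a ukprn']
]
p institute_attributes(rows)
```

Each resulting hash could then be passed to Institute.find_or_create_by, so running the import twice should not create duplicates.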

Ruby: Reading contents of an xls file and getting each cell's information

require 'open-uri'
require 'spreadsheet'

rows = []
temp_rows = []
column_headers = []
index = 0
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
# Kernel#open no longer opens URLs on Ruby 3.0+, so use URI.open from open-uri
doc = Spreadsheet.open(URI.open(url))
sheet1 = doc.worksheet(0)
sheet1.each do |row|
  rows << row.to_a
end

# Find the row where the header block starts
rows.each_with_index do |row, ind|
  if row[0] == 'Year'
    index = ind
    break
  end
end

# Collect header rows until the first data row (first cell contains a digit).
# The upper bound of 7 is specific to this file's layout.
(index..7).each do |i|
  # puts rows[i].inspect
  if rows[i][0].to_s =~ /[0-9]/
    break
  else
    temp_rows << rows[i]
  end
end

col_size = temp_rows[0].size
# puts temp_rows.inspect

# Merge the stacked header cells of each column into one string
col_size.times do |c|
  temp_str = ""
  temp_rows.each do |row|
    temp_str += ' ' + row[c].to_s unless row[c].nil?
  end
  # puts temp_str.inspect
  column_headers << temp_str
end
puts 'Column Headers of this xls file are : '
# puts column_headers.inspect
column_headers.each do |col|
  puts col.strip.inspect if col.length > 1
end
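
The header-merging step can also be written more compactly with Array#transpose. This is a sketch of the same idea rather than a drop-in replacement: merge_header_rows is a hypothetical helper, and the sample header rows are invented, not taken from the file above.

```ruby
# Hypothetical helper: merges header cells that are stacked over several
# rows into one string per column, ignoring empty cells.
def merge_header_rows(header_rows)
  width = header_rows.map(&:size).max.to_i
  # transpose needs rows of equal length, so pad short rows with nil
  padded = header_rows.map { |row| row + [nil] * (width - row.size) }
  merged = padded.transpose.map do |cells|
    cells.compact.map { |cell| cell.to_s.strip }.reject(&:empty?).join(' ')
  end
  merged.reject(&:empty?) # drop columns that have no header text at all
end

# Invented two-row header block.
header_rows = [
  ['Year', 'Total', nil],
  [nil, 'Population', 'Male']
]
p merge_header_rows(header_rows)
```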

How to read excel values using queries in ruby?

You can use Sequel and OLEDB to read Excel files (this relies on ADO, so it only works on Windows):

require 'sequel'
Encoding.default_external = 'utf-8' # needed for umlauts in Excel

def read_excel(source)
  source = File.expand_path(source) # full path needed

  db = Sequel.ado(:conn_string=>"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=#{source};Extended Properties=Excel 8.0;")
  # Excel 2000 (for table names, use a dollar after the sheet name, e.g. Sheet1$)
  p db.test_connection

  dataset = db[:'Tabelle1$']
  p dataset
  dataset.each{|row|
    puts row
  }
end # read_excel

read_excel('my_spreadsheet.xls')

You need to know the name of the tab (in my example it's Tabelle1).


The 'real' solution here is not Sequel but the ADO interface. I'm not familiar with other ORMs, so I can't help much there, but you could look at ActiveRecord, for example.

There are guides on connecting to MS Access or SQL Server via ADO, and some of them use ActiveRecord.
If you replace the connection string with the Excel connection string from my Sequel example, you may be able to use other ORMs.

You may also try to read the Excel data via an ODBC connection.

How to parse a single excel row data in rails

I recommend using the BatchFactory gem.

It uses the Roo gem under the hood.

BatchFactory can read all the rows of an Excel file as an array of hashes, which is very handy to work with.

require 'batch_factory'
factory = BatchFactory.from_file 'filename.xlsx', keys: [:header1, :header2]
factory.rows

This will give you

[
{ header1: 'value11', header2: 'value12' },
{ header1: 'value21', header2: 'value22' },
...
]

In your case you can do

factory = BatchFactory.from_file 'filename.xlsx', keys: [:firstname]
firstnames = factory.rows.map { |row| row[:firstname] }

This will give you an array of all the values from the firstname column.

UPDATE

You can even omit rows in factory.rows.map because BatchFactory implements method_missing, i.e.

firstnames = factory.map { |row| row[:firstname] }
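
To see what an "array of hashes" buys you, here is a minimal sketch in plain Ruby. It is not BatchFactory's implementation; rows_to_hashes is a hypothetical helper and the data is made up. It pairs each row's cells with the given keys by position:

```ruby
# Hypothetical helper: pairs each row's cells with the given keys by
# position, producing one hash per row.
def rows_to_hashes(rows, keys)
  rows.map { |row| keys.zip(row).to_h }
end

# Made-up rows standing in for the data rows of a sheet.
rows = [['Ann', 'Lee'], ['Bob', 'Ray']]
records = rows_to_hashes(rows, [:firstname, :lastname])
firstnames = records.map { |record| record[:firstname] }
p firstnames
```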

Ruby Roo Gem - read Excel xlsx sheet into Hash

Work on sheet

# Open the workbook
wb = Roo::Spreadsheet.open '/Users/ankur/Desktop/wb.xlsx'
# Get first sheet
sheet = wb.sheet(0)
# Call #parse on that
sheet.parse(Fruits: "Fruits", Qty: "Qty", Location: "Location", clean: true)
#=> [{:Fruits=>"apples", :Qty=>5, :Location=>"Kitchen"}, {:Fruits=>"pearls", :Qty=>10, :Location=>"Bag"}, {:Fruits=>"plums", :Qty=>15, :Location=>"Bagpack"}]
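
Conceptually, a header-based parse looks up each requested column by its title and then builds one hash per data row. The sketch below illustrates that idea on plain arrays. It is not Roo's code: parse_by_header is a hypothetical helper, and it assumes the header is the first row.

```ruby
# Hypothetical helper: maps result keys to column titles, finds each
# title in the header row, and returns one hash per data row.
def parse_by_header(rows, mapping)
  return [] if rows.empty?

  header, *data = rows
  indexes = mapping.transform_values { |title| header.index(title) }
  missing = indexes.select { |_key, idx| idx.nil? }.keys
  raise ArgumentError, "missing columns: #{missing.join(', ')}" unless missing.empty?

  data.map { |row| indexes.transform_values { |idx| row[idx] } }
end

# Made-up sheet content with the header in the first row.
rows = [
  ['Fruits', 'Qty', 'Location'],
  ['apples', 5, 'Kitchen'],
  ['plums', 15, 'Bag']
]
p parse_by_header(rows, Fruits: 'Fruits', Qty: 'Qty')
```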

Reading and writing Excel files using Ruby on a server without Excel installed

I agree with Gonzih, and I use roo fairly regularly. It allows me to read, write, and write using a template file.
The project is fairly well documented on their site.

I always use something like:

# path, sheet and start are assumed to be defined earlier
input = Excel.new(path)
output = Array.new
input.default_sheet = input.sheets[sheet]
start.upto(input.last_row) do |row|
  output << input.row(row)
end

p output
# => a nested array representing the spreadsheet

p output[0]
# => [row1_column_a, row1_column_b...]

to read a spreadsheet. Note that the roo gem requires you to use Excelx.new instead of Excel.new if your file is an .xlsx.

To write, you can use:

book = Spreadsheet::Workbook.new
write_sheet = book.create_worksheet
row_num = 0
input.each do |row|
  write_sheet.row(row_num).replace row
  row_num += 1
end
book.write "/path/to/save/to.xls"

where input is an array structured just like output was

Parsing XLS and XLSX (MS Excel) files with Ruby?

Just found roo, that might do the job - works for my requirements, reading a basic spreadsheet.

Fastest way to read the first row of big XLSX file in Ruby

The Ruby gem roo does not support file streaming; it reads the whole file into memory. As you say, that works fine for smaller files but not so well for reading small sections of huge files.

You need to use a different library or approach. For example, you can use the creek gem, which describes itself as:

a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.

And, taking the example from the project's README, it's pretty straightforward to translate the code you wrote for roo into code that uses creek:

require 'creek'
creek = Creek::Book.new(file_path)
sheet = creek.sheets[0]
# rows is an Enumerator, so take the first element instead of indexing
header = sheet.rows.first
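
The reason a streaming reader is fast here is that rows are produced lazily, so asking for the first row stops the work right away. The sketch below demonstrates this with a plain Ruby Enumerator. It assumes the library exposes rows as an Enumerator; streaming_rows is a hypothetical stand-in and does not read any file.

```ruby
# Hypothetical stand-in for a streaming reader: yields rows one at a time
# and records in `log` which rows were actually produced.
def streaming_rows(total, log)
  Enumerator.new do |yielder|
    total.times do |i|
      log << i # record that row i was "read"
      yielder << ["row #{i + 1}"]
    end
  end
end

log = []
header = streaming_rows(1_000_000, log).first
p header
puts "rows produced: #{log.size}"
```

Only one row is produced even though the source nominally holds a million, which is the behaviour you want when you only need the header.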

Note: A quick google of your StackOverflow question title led me to this blog post as the top search result. It's always worth searching on Google first.

Reading Excel formulae using Ruby

A different approach is:

  • convert it to csv using xls2csv: http://linux.die.net/man/1/xls2csv

  • read it using the ruby standard lib: http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html

I hope this can help you.
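
Once the file has been converted, reading it takes a few lines with the standard csv library. This is a minimal sketch that assumes the converted file has a header row; the inline csv_text stands in for the file's content and is made up.

```ruby
require 'csv'

# Made-up CSV content standing in for the converted spreadsheet.
csv_text = "name,qty\napples,5\nplums,15\n"

# headers: true uses the first line as keys; every value is read as a String.
rows = CSV.parse(csv_text, headers: true).map(&:to_h)
rows.each { |row| puts "#{row['name']}: #{row['qty']}" }
```

For a file on disk, CSV.foreach(path, headers: true) iterates over the rows in the same way without loading everything at once.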


