How do I read the content of an Excel spreadsheet using Ruby?
It looks like row, whose class is Spreadsheet::Excel::Row, is effectively an Excel Range, and that it either includes Enumerable or at least exposes some enumerable behaviours (#each, for example).
So you might rewrite your script something like this:
require 'spreadsheet'
book = Spreadsheet.open('myexcel.xls')
sheet1 = book.worksheet('Sheet1') # can use an index or worksheet name
sheet1.each do |row|
  break if row[0].nil? # if first cell empty
  puts row.join(',') # looks like it calls "to_s" on each cell's Value
end
Note that I've parenthesised the arguments, which is generally advisable these days, and removed the semicolons, which are not necessary unless you're writing multiple statements on a line (which you should rarely, if ever, do).
It's probably a hangover from a larger script, but I'll point out that in the code given the book and sheet1 variables aren't really needed, and that Spreadsheet.open takes a block, so a more idiomatic Ruby version might be something like this:
require 'spreadsheet'
Spreadsheet.open('MyTestSheet.xls') do |book|
  book.worksheet('Sheet1').each do |row|
    break if row[0].nil?
    puts row.join(',')
  end
end
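The loop above relies only on the sheet yielding array-like rows through #each, so the same logic can be tried without a spreadsheet file. Here is a minimal sketch under that assumption; the rows_until_blank helper and the sample data are made up for illustration and are not part of the spreadsheet gem:

```ruby
# Hypothetical helper: collect comma-joined rows until the first row
# whose first cell is nil. Works on anything that responds to #each
# and yields array-like rows.
def rows_until_blank(sheet)
  lines = []
  sheet.each do |row|
    break if row[0].nil?         # stop at the first row with an empty first cell
    lines << row.to_a.join(',')  # join calls to_s on each cell
  end
  lines
end

# Plain arrays stand in for worksheet rows here.
sample_sheet = [
  ['Name', 'Qty'],
  ['apples', 5],
  [nil, nil],
  ['ignored', 1]
]

puts rows_until_blank(sample_sheet)
```

Returning the lines instead of printing them inside the loop makes the stopping rule easy to check before pointing the code at a real workbook.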
Extract Data from Excel Spreadsheet into Database in Ruby
Your table has ukprn and name as its columns, so the find_or_create_by call should look like this:
Institute.find_or_create_by(ukprn: ukprn, name: name)
Now you just need to initialize ukprn and name from row.
require 'roo'
xlsx = Roo::Excelx.new(File.expand_path('../Downloads/UKPRN.xlsx'))
xlsx.each_row_streaming(offset: 1) do |row|
  Institute.find_or_create_by(ukprn: row[0].value, name: row[1].value)
end
To execute this code, either:
- put it in db/seeds.rb and execute rake db:seed
- put it in script.rb and run rails runner script.rb
- copy-paste it in console (not really recommended)
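If the sheet may contain blank rows, it can help to build the attributes in a separate step before calling find_or_create_by. The following is only a sketch: the Cell struct stands in for any object that responds to #value, and institute_attributes and the sample values are hypothetical, not part of roo or Rails.

```ruby
# Stand-in for a spreadsheet cell: anything that responds to #value.
Cell = Struct.new(:value)

# Hypothetical helper: turn one row of cells into the attributes hash
# passed to find_or_create_by. Returns nil when the ukprn cell is blank.
def institute_attributes(row)
  ukprn = row[0]&.value
  name  = row[1]&.value
  return nil if ukprn.nil? || ukprn.to_s.strip.empty?

  { ukprn: ukprn, name: name.to_s.strip }
end

# Made-up sample rows: one valid, one without an identifier.
rows = [
  [Cell.new(10000001), Cell.new(' Example College ')],
  [Cell.new(nil), Cell.new('No identifier')]
]

attributes = rows.map { |row| institute_attributes(row) }.compact
p attributes
```

Each hash in attributes could then be passed to Institute.find_or_create_by, and rows without a ukprn are skipped instead of creating empty records.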
Ruby: Reading contents of a xls file and getting each cells information
require 'open-uri'
require 'spreadsheet'
rows = []
temp_rows = []
column_headers = []
index = 0
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
# URI.open comes from open-uri; Kernel#open no longer accepts URLs in Ruby 3
doc = Spreadsheet.open(URI.open(url))
sheet1 = doc.worksheet(0)
sheet1.each do |row|
  rows << row.to_a
end
# find the row where the header block starts
rows.each_with_index do |row, ind|
  if row[0] == 'Year'
    index = ind
    break
  end
end
# collect header rows until the first row that starts with a digit
# (the header block is assumed to end by row 7)
(index..7).each do |i|
  break if rows[i][0].to_s =~ /[0-9]/
  temp_rows << rows[i]
end
col_size = temp_rows[0].size
# join the cells of each column across the header rows
col_size.times do |c|
  temp_str = ''
  temp_rows.each do |row|
    temp_str += ' ' + row[c].to_s unless row[c].nil?
  end
  column_headers << temp_str
end
puts 'Column Headers of this xls file are : '
column_headers.each do |col|
  puts col.strip.inspect if col.length > 1
end
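The header-merging step in the script above can be isolated into a small function and tried on plain arrays. This is a sketch only: merge_header_rows is a hypothetical name, and the sample header rows are made up rather than taken from the actual file.

```ruby
# Hypothetical helper: merge several header rows into one label per column.
# nil cells are skipped; non-string cells are converted with to_s.
def merge_header_rows(header_rows)
  width = header_rows.map(&:size).max || 0
  (0...width).map do |col|
    header_rows.map { |row| row[col] }.compact.map(&:to_s).join(' ').strip
  end
end

# Made-up two-row header where the second row refines the first.
header_rows = [
  ['Year', 'Population', nil],
  [nil, 'Total', 'Urban']
]

p merge_header_rows(header_rows)
```

Taking the widest row as the column count avoids losing labels when the first header row is shorter than the ones below it.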
How to read excel values using queries in ruby?
You can use Sequel and OLEDB to read Excel Files:
require 'sequel'
Encoding.default_external = 'utf-8' # needed for umlauts in Excel
def read_excel(source)
  source = File.expand_path(source) # full path needed
  db = Sequel.ado(conn_string: "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=#{source};Extended Properties=Excel 8.0;")
  # Excel 2000 (for table names, use a dollar after the sheet name, e.g. Sheet1$)
  p db.test_connection
  dataset = db[:'Tabelle1$']
  p dataset
  dataset.each do |row|
    puts row
  end
end
read_excel('my_spreadsheet.xls')
You need to know the name of the tab (in my example it's Tabelle1).
The 'real' solution here is not Sequel but the ADO interface. I'm not familiar with other ORMs, so I can't help much there, but you could look at ActiveRecord, for example.
There are hints on how to connect to MS Access or SQL Server via ADO, and some of them use ActiveRecord.
If you take the Excel connection string from my Sequel example and use it in place of theirs, you may be able to use other ORMs.
You could also try reading the Excel data via an ODBC connection.
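Since the connection string is the reusable part, it can be built by a small helper. This is a sketch under assumptions: excel_connection_string is a hypothetical name, the .xls branch repeats the Jet string from the Sequel example, and the .xlsx branch assumes the Microsoft ACE OLEDB 12.0 provider is installed on the Windows machine.

```ruby
# Hypothetical helper: build an ADO connection string for an Excel file.
# Assumption: .xls uses the Jet provider from the example above; .xlsx
# uses the ACE provider, which has to be installed separately.
def excel_connection_string(source)
  path = File.expand_path(source) # ADO needs the full path
  case File.extname(path).downcase
  when '.xls'
    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=#{path};Extended Properties=Excel 8.0;"
  when '.xlsx'
    "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=#{path};Extended Properties=Excel 12.0 Xml;"
  else
    raise ArgumentError, "unsupported file type: #{path}"
  end
end

puts excel_connection_string('my_spreadsheet.xls')
```

The returned string could be passed as conn_string to Sequel.ado, or to whichever ADO-capable library is used instead.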
How to parse a single excel row data in rails
I recommend the BatchFactory gem.
It uses the Roo gem under the hood.
BatchFactory can read all the rows of an Excel file as an array of hashes, which is very handy to work with.
require 'batch_factory'
factory = BatchFactory.from_file 'filename.xlsx', keys: [:header1, :header2]
factory.rows
This will give you
[
{ header1: 'value11', header2: 'value12' },
{ header1: 'value21', header2: 'value22' },
...
]
In your case you can do
factory = BatchFactory.from_file 'filename.xlsx', keys: [:firstname]
firstnames = factory.rows.map { |row| row[:firstname] }
This will give you an array of all values from the firstname column.
UPDATE
You can even omit rows in factory.rows.map, because BatchFactory implements method_missing, i.e.
firstnames = factory.map { |row| row[:firstname] }
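To make the array-of-hashes shape concrete, here is a plain-Ruby sketch of pairing keys with row cells. It is not BatchFactory's implementation; rows_to_hashes is a hypothetical helper and the values reuse the placeholder names from the example above.

```ruby
# Hypothetical helper: pair each row's cells with the given keys,
# producing one hash per row (the same shape as factory.rows above).
def rows_to_hashes(rows, keys)
  rows.map { |row| keys.zip(row).to_h }
end

raw_rows = [
  ['value11', 'value12'],
  ['value21', 'value22']
]

hashes = rows_to_hashes(raw_rows, [:header1, :header2])
firsts = hashes.map { |row| row[:header1] } # one column as a flat array
p firsts
```

Once rows are hashes, extracting a single column is an ordinary map, which is what the firstnames example relies on.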
Ruby Roo Gem - read Excel xlsx sheet into Hash
Work on sheet
# Open the workbook
wb = Roo::Spreadsheet.open '/Users/ankur/Desktop/wb.xlsx'
# Get first sheet
sheet = wb.sheet(0)
# Call #parse on that
sheet.parse(Fruits: "Fruits", Qty: "Qty", Location: "Location", clean: true)
#=> [{:Fruits=>"apples", :Qty=>5, :Location=>"Kitchen"}, {:Fruits=>"pearls", :Qty=>10, :Location=>"Bag"}, {:Fruits=>"plums", :Qty=>15, :Location=>"Bagpack"}]
Reading and writing Excel files using Ruby on a server without Excel installed
I agree with Gonzih, and I use roo fairly regularly. It allows me to read, write, and write using a template file.
The project is fairly well documented on their site.
I always use something like:
# path, sheet (a sheet index) and start (the first row number) are assumed to be defined already
input = Excel.new(path)
output = Array.new
input.default_sheet = input.sheets[sheet]
start.upto(input.last_row) do |row|
  output << input.row(row)
end
p output
# => a nested array representing the spreadsheet
p output[0]
# => [row1_column_a, row1_column_b...]
to read a spreadsheet. Note that the roo gem requires you to use Excelx.new instead of Excel.new if your file is a .xlsx.
To write, you can use the spreadsheet gem:
require 'spreadsheet'
book = Spreadsheet::Workbook.new
write_sheet = book.create_worksheet
row_num = 0
input.each do |row|
  write_sheet.row(row_num).replace row
  row_num += 1
end
book.write "/path/to/save/to.xls"
where input is an array structured just like output was.
Parsing XLS and XLSX (MS Excel) files with Ruby?
Just found roo, that might do the job - works for my requirements, reading a basic spreadsheet.
Fastest way to read the first row of big XLSX file in Ruby
The Ruby gem roo does not stream the file by default; it reads the whole file into memory. As you say, that works fine for smaller files, but not so well for reading small sections of huge files.
You need to use a different library or approach. For example, you can use the creek gem, which describes itself as:
a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.
And, taking the example from the project's README, it's pretty straightforward to translate the code you wrote for roo into code that uses creek:
require 'creek'
creek = Creek::Book.new(file_path)
sheet = creek.sheets[0]
# rows is an Enumerator, so take the first element rather than indexing with [0]
header = sheet.rows.first
Note: A quick google of your StackOverflow question title led me to this blog post as the top search result. It's always worth searching on Google first.
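The benefit of a streaming reader can be illustrated without any gem. In this sketch a plain Enumerator stands in for a row source that produces rows on demand; the row contents and the rows_read counter are made up for the demonstration and say nothing about creek's internals.

```ruby
# Simulated row source: counts how many rows were actually produced.
rows_read = 0
rows = Enumerator.new do |yielder|
  1.upto(1_000_000) do |n|
    rows_read += 1
    yielder << ["row #{n} col A", "row #{n} col B"]
  end
end

header = rows.first # stops as soon as the first row has been yielded
p header
p rows_read
```

Only one row is produced before first returns, whereas building the full array up front would have generated all one million rows just to read the header.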
Reading Excel formulae using Ruby
A different approach is:
- convert it to CSV using xls2csv: http://linux.die.net/man/1/xls2csv
- read it using the Ruby standard library: http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html
I hope this can help you.
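The second step can be sketched with the csv library. The CSV text below is made-up sample data standing in for whatever xls2csv writes out; in practice you would read the converted file instead.

```ruby
require 'csv'

# Sample CSV text standing in for the output of xls2csv.
csv_text = <<~TEXT
  Fruits,Qty,Location
  apples,5,Kitchen
  plums,15,Bag
TEXT

# headers: true treats the first line as column names;
# converters: :numeric turns "5" into the integer 5.
table = CSV.parse(csv_text, headers: true, converters: :numeric)

table.each do |row|
  puts "#{row['Fruits']}: #{row['Qty']} (#{row['Location']})"
end
```

For a file on disk, CSV.foreach(path, headers: true) yields the same kind of row objects one at a time.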