Parsing XLS and XLSX (MS Excel) files with Ruby?
I just found roo, which might do the job. It works for my requirements, which are reading a basic spreadsheet.
Is there any Ruby gem to read both .xls and .xlsx files?
I've had success using roo with the roo-xls extension.
Fastest way to read the first row of big XLSX file in Ruby
The Ruby gem roo does not support file streaming; it reads the whole file into memory. As you say, that works fine for smaller files but not so well for reading small sections of huge files.
You need to use a different library or approach. For example, you can use the creek gem, which describes itself as:
a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.
And, taking the example from the project's README, it's pretty straightforward to translate the code you wrote for roo into code that uses creek:
require 'creek'
creek = Creek::Book.new(file_path)
sheet = creek.sheets[0]
header = sheet.rows.first # rows is an Enumerator, so take the first element instead of indexing
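To see why a streaming reader helps when you only want the first row, here is a rough stdlib-only sketch. It does not use creek at all: `first_row_lazily` and the fake row data are hypothetical stand-ins for a source that produces one row at a time, and the counter is only there to show how much work was actually done.

```ruby
# Hypothetical stand-in for a streaming row source (this is not creek's API).
# Rows are produced one at a time, only when the consumer asks for them.
def first_row_lazily(total_rows)
  produced = 0
  rows = Enumerator.new do |yielder|
    total_rows.times do |i|
      produced += 1 # count the rows that were actually produced
      yielder << { "A#{i + 1}" => "value #{i + 1}" }
    end
  end
  header = rows.first # iteration stops as soon as one row has been yielded
  [header, produced]
end

header, produced = first_row_lazily(1_000_000)
puts "header: #{header.inspect}, rows produced: #{produced}"
```

Only one row is produced even though the source could yield a million, which is the property you want from a reader when the file is large.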
Note: A quick google of your StackOverflow question title led me to this blog post as the top search result. It's always worth searching on Google first.
Ruby Roo Gem - read Excel xlsx sheet into Hash
Work on sheet
# Open the workbook
wb = Roo::Spreadsheet.open '/Users/ankur/Desktop/wb.xlsx'
# Get first sheet
sheet = wb.sheet(0)
# Call #parse on that
sheet.parse(Fruits: "Fruits", Qty: "Qty", Location: "Location", clean: true)
#=> [{:Fruits=>"apples", :Qty=>5, :Location=>"Kitchen"}, {:Fruits=>"pearls", :Qty=>10, :Location=>"Bag"}, {:Fruits=>"plums", :Qty=>15, :Location=>"Bagpack"}]
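If it helps to picture what a header-to-key mapping like this does, here is a minimal sketch in plain Ruby. It is not roo's implementation; `rows_to_hashes` is a hypothetical helper that assumes the first row is the header row and that the rows are already plain arrays.

```ruby
# Rough illustration of header-based row mapping (not roo's implementation).
# rows:    array of arrays, where the first row is the header row
# mapping: hash of output key => header text to look for
def rows_to_hashes(rows, mapping)
  header, *body = rows
  # Find the column index of each requested header (nil if it is missing).
  columns = mapping.transform_values { |title| header.index(title) }
  body.map do |row|
    columns.transform_values { |idx| idx && row[idx] }
  end
end

rows = [
  ["Fruits", "Qty", "Location"],
  ["apples", 5, "Kitchen"],
  ["plums", 15, "Bagpack"]
]
result = rows_to_hashes(rows, Fruits: "Fruits", Qty: "Qty", Location: "Location")
puts result.inspect
```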
Single Ruby Gem that parses BOTH xlsx and xls Excel files?
I would just combine the rubyXL gem and the spreadsheet gem if you're happy with the individual results both provide.
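One way to combine them is to pick the gem by file extension. This is only a sketch: `parser_for` and `open_workbook` are hypothetical helper names, and it assumes both the rubyXL and spreadsheet gems are installed. The gems are required lazily, so only the one you need gets loaded.

```ruby
# Decide which gem should handle a file, based on its extension.
def parser_for(path)
  case File.extname(path).downcase
  when ".xlsx" then :rubyxl
  when ".xls"  then :spreadsheet
  else raise ArgumentError, "unsupported file type: #{path}"
  end
end

# Open the workbook with whichever gem matches the extension.
def open_workbook(path)
  case parser_for(path)
  when :rubyxl
    require "rubyXL" # loaded only when an .xlsx file is opened
    RubyXL::Parser.parse(path)
  when :spreadsheet
    require "spreadsheet" # loaded only when an .xls file is opened
    Spreadsheet.open(path)
  end
end
```

Note that the two gems return different workbook objects with different methods, so the calling code still has to handle each case or wrap them behind a common interface of your own.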
How to parse a single excel row data in rails
I recommend using the BatchFactory gem.
It uses the Roo gem under the hood.
BatchFactory can read all rows of an Excel file as an array of hashes, which is very handy to work with.
require 'batch_factory'
factory = BatchFactory.from_file 'filename.xlsx', keys: [:header1, :header2]
factory.rows
This will give you
[
{ header1: 'value11', header2: 'value12' },
{ header1: 'value21', header2: 'value22' },
...
]
In your case you can do
factory = BatchFactory.from_file 'filename.xlsx', keys: [:firstname]
firstnames = factory.rows.map { |row| row[:firstname] }
This will give you an array of all values from the firstname column.
UPDATE
You can even omit rows in factory.rows.map, because BatchFactory implements method_missing, i.e.
firstnames = factory.map { |row| row[:firstname] }
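For anyone curious how that kind of shortcut can work, here is a minimal sketch of method_missing delegation. It is not BatchFactory's actual source; `RowSet` is a hypothetical class that forwards unknown methods to its rows array.

```ruby
# Hypothetical class showing method_missing delegation
# (this is not BatchFactory's source code).
class RowSet
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Forward unknown methods to the rows array when it can handle them.
  def method_missing(name, *args, &block)
    if rows.respond_to?(name)
      rows.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with what method_missing forwards.
  def respond_to_missing?(name, include_private = false)
    rows.respond_to?(name, include_private) || super
  end
end

set = RowSet.new([{ firstname: "Ann" }, { firstname: "Bob" }])
firstnames = set.map { |row| row[:firstname] }
puts firstnames.inspect
```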