Fastest way to skip lines while parsing files in Ruby?
file.each_line.drop(500).take(100) # returns lines 501-600 (IO#lines was removed in Ruby 3.0, so use each_line)
Generally, you can't avoid reading the file from the start up to the line you are interested in, because lines can have different lengths. What you can avoid is loading the whole file into one big array: read line by line, counting and discarding lines until you reach the ones you want. That is pretty much what your own example does; you can just make it more Rubyish.
PS: the Tin Man's comment made me do some experimenting. While I didn't find any reason why `drop` would load the whole file, there is indeed a problem: `drop` returns the rest of the file in an array. Here's a way this could be avoided:
file.each_line.select.with_index { |_line, i| (500..599) === i } # with_index is 0-based, so 500..599 selects lines 501-600
PS2: Doh, the code above, while not building a huge array, still iterates through the whole file, even the lines after line 600. :( Here's a third version:
enum = file.each_line
500.times { enum.next } # skip 500
enum.take(100) # take the next 100
or, if you prefer FP:
file.each_line.tap { |enum| 500.times { enum.next } }.take(100)
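For what it's worth, a lazy enumerator is another way to get the same effect without calling `next` by hand. This is only a minimal sketch, assuming Ruby 2.1 or newer and a file addressed by path; `line_window` is a hypothetical helper name, not something from the question:

```ruby
# Hypothetical helper: skip `skip` lines, then return the next `count` lines.
# File.foreach without a block returns an Enumerator; `lazy` keeps `drop`
# from building an array of the remaining lines, and `first` stops reading
# as soon as `count` lines have been collected.
def line_window(path, skip:, count:)
  File.foreach(path).lazy.drop(skip).first(count)
end

# Usage (lines 501-600 of a hypothetical file):
# line_window("big.log", skip: 500, count: 100)
```

The skipped lines are still read from disk, as explained above; they are just never kept in memory.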
Anyway, the good point of this monologue is that you can learn multiple ways to iterate over a file. ;)
Ruby - How to skip/ignore specific lines when reading a file?
You could do it like this:
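skip_markers = ["#", "Feature", "In order", "As a", "I want"]
File.foreach(file) do |line| # foreach closes the file once the block is done
  line.chomp!
  # Skip blank lines and lines containing any of the markers.
  next if line.empty? || skip_markers.any? { |marker| line.include?(marker) }
  # ... process the remaining lines here ...
end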
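If you would rather collect the surviving lines than process them inside the loop, the same filter can be written with `reject`. A rough sketch, assuming the marker list from the snippet above; the constant and method names are made up for illustration:

```ruby
# Regexp.union escapes each string, so the markers are matched literally.
SKIP_PATTERN = Regexp.union("#", "Feature", "In order", "As a", "I want")

# Hypothetical helper: accepts anything that enumerates lines
# (an Array of strings, File.foreach(path), or an open IO).
def kept_lines(lines)
  lines.map(&:chomp).reject { |line| line.empty? || line.match?(SKIP_PATTERN) }
end

# Usage with a hypothetical path:
# kept_lines(File.foreach("login.feature"))
```

Note that this builds an array of all kept lines, so for very large files the block form above is the safer choice.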
Skipping the first line when reading in a file in 1.9.3
Change `each` to `each_with_index do |line, index|` and add `next if index == 0` as the first statement in the block; that will skip the first line.
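Put together, that might look like the following sketch. The helper name is hypothetical, and a `StringIO` stands in for the real file in the usage comment:

```ruby
require "stringio"

# Hypothetical helper: yields every line of `io` except the first one.
def each_line_after_first(io)
  io.each_line.with_index do |line, index|
    next if index == 0 # index is 0-based, so 0 is the first line
    yield line
  end
end

# Usage with an in-memory stand-in for a file:
# each_line_after_first(StringIO.new("header\nfirst\nsecond\n")) { |line| puts line }
```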
Ruby CSV: How to skip the first two lines of file?
I didn't benchmark, but try this:
CSV.to_enum(:foreach, filename, col_sep: "\t").drop(2).each do |row|
  # ... rows from the third line onward ...
end
Skip first 5 lines of CSV
You should be able to work around this by dropping the unwanted lines yourself and handing the CSV module a valid CSV string built from the rest of your otherwise incompatible data:
CSV.parse(File.readlines(path).drop(5).join) do |row|
# ...
end
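The same idea also works when the text is already in memory, and any CSV options can be passed through. A small sketch, with a hypothetical helper name and made-up sample data:

```ruby
require "csv"

# Hypothetical helper: drop `skip` leading lines from raw text,
# then parse whatever is left as CSV.
def parse_csv_after(raw, skip:, **csv_options)
  CSV.parse(raw.lines.drop(skip).join, **csv_options)
end

# Usage with made-up data: two junk lines, then a header row and records.
# raw = "junk 1\njunk 2\nname,age\nann,30\n"
# parse_csv_after(raw, skip: 2, headers: true).each { |row| p row.to_h }
```

Like `File.readlines`, this holds the whole input in memory, so it suits small and medium files.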
How to skip the first line of a CSV file and make the second line the header
I don't think there's an elegant way of doing it, but it can be done:
require "csv"
# Create a stream using the original file.
# Don't use `textmode` since it generates a problem when using this approach.
file = File.open "file.csv"
# Consume the first CSV row.
# `\r` is my row separator character. Verify your file to see if it's the same one.
loop { break if file.readchar == "\r" }
# Create your CSV object using the remainder of the stream.
csv = CSV.new file, headers: true
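If your rows end in "\n" rather than "\r", consuming the first line with `gets` is a possible shortcut. This is a sketch under that assumption (and assuming the first line contains no quoted newline); the helper name is hypothetical:

```ruby
require "csv"
require "stringio"

# Hypothetical helper: discard the first line of `io`, then let CSV treat
# the next line as the header row. Assumes "\n"-terminated lines.
def csv_with_second_line_header(io)
  io.gets # consume and discard the first line
  CSV.new(io, headers: true)
end

# Usage with an in-memory stand-in for the file:
# io = StringIO.new("generated by tool\nname,age\nann,30\n")
# csv_with_second_line_header(io).each { |row| p row.to_h }
```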
Using Ruby to find a word or phrase in a text file, capture the word, skip a line, and then read lines until a blank line (repeat)
Here's mine:
data.scan(/(MATCH ME)(.*?)\n\n((?:(?!\n\n).)*)/m).each do |m, n, lines|
  lines.each_line do |line|
    puts [m, n, *line.unpack('A9A10A*')].map(&:strip).join(',')
  end
end
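That regex is ugly, but still better than looking at 30 lines.
(?:(?!\n\n).)* matches any run of characters up to, but not including, the next pair of consecutive newlines: before consuming each character, the lookahead (?!\n\n) checks that a blank line does not start at that position. The (?:) is a non-capturing group, so the '.' is not captured on its own.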
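To see that pattern in isolation, here is a stripped-down sketch that pulls out just the block of lines following a marker. The helper name and the sample text are made up for illustration:

```ruby
# Hypothetical helper: returns the lines between the first blank line after
# `marker` and the next blank line, or nil if the marker is not found.
def lines_after_marker(text, marker)
  # (?:(?!\n\n).)* consumes characters until a blank line is about to start.
  text[/#{Regexp.escape(marker)}.*?\n\n((?:(?!\n\n).)*)/m, 1]
end

# Usage with made-up data:
# lines_after_marker("MATCH ME header\n\nrow one\nrow two\n\ntrailer\n", "MATCH ME")
```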