Parsing and structuring of a text file
If the answer to @mudasobwa's question "Do you want to grab everything having 88 value?" is yes, this is a solution:
# Read all lines with line breaks removed; readlines also closes the file.
lines = File.readlines("file.txt", chomp: true)
current_head = ""
res = []
lines.each do |line|
  case line
  when /Head \d+/
    current_head = line
  when /\w 88/
    res << "#{current_head}, #{line}"
  end
end
puts res
Parsing a structured text file in Ruby
You should look for the indicator lines (description, quality, text and stats) in a loop and fill the hash while processing the document line by line.
Another option would be to use regular expressions and parse the document at once, but you don't really need regular expressions here, and if you're not familiar with them, I'd have to recommend against regexes.
UPDATE:
sections = []
File.open("deneme") do |f|
  current = {:description => "", :text => "", :quality => "", :stats => ""}
  in_description = false
  in_quality = false
  f.each_line do |line|
    if in_description
      if line.strip == ""
        in_description = false
      else
        current[:description] += line
      end
    elsif in_quality
      current[:quality] = line.strip
      in_quality = false
    elsif line.strip == "description"
      in_description = true
    elsif line.strip == "quality"
      in_quality = true
    elsif line.match(/^text: /)
      current[:text] = line[6..-1].strip
    elsif line.match(/^stats /)
      current[:stats] = line[6..-1].strip
      sections.push(current)
      current = {:description => "", :text => "", :quality => "", :stats => ""}
    end
  end
end
[ruby] Get file, parse text and create date object
Something like this:
require 'date'
holidays = File.read('holidays.txt').split(/\n/).map do |row|
  date, holiday_name = row.split(';')
  # Date.parse does not take a format string; Date.strptime does.
  date = Date.strptime(date, '%d.%m.%Y')
  [date, holiday_name]
end.to_h
=> {
#<Date: 2017-01-01 ((2457755j,0s,0n),+0s,2299161j)> => "New Year",
#<Date: 2017-04-16 ((2457860j,0s,0n),+0s,2299161j)> => "Easter",
#<Date: 2017-12-25 ((2458113j,0s,0n),+0s,2299161j)> => "Christmas"
}
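The same idea can be tried without a file. This is only a sketch: the sample rows and the helper name parse_holidays are assumptions, and the row format (dd.mm.yyyy;Name) is inferred from the code above rather than from the real holidays.txt.

```ruby
require 'date'

# Hypothetical sample data standing in for the contents of holidays.txt.
sample = "01.01.2017;New Year\n16.04.2017;Easter\n25.12.2017;Christmas\n"

# Hypothetical helper: turns "dd.mm.yyyy;Name" rows into a Date => name hash.
def parse_holidays(text)
  text.each_line(chomp: true).reject(&:empty?).to_h do |row|
    date, name = row.split(';', 2)
    # strptime raises an error on rows that do not match the format.
    [Date.strptime(date, '%d.%m.%Y'), name]
  end
end

holidays = parse_holidays(sample)
puts holidays[Date.new(2017, 12, 25)] # prints "Christmas"
```

Because Date objects compare by value, a freshly built Date can be used to look up an entry in the hash.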
Parsing lines of text from external file in Ruby
Try this
parsed_email = {}
# File.foreach reads the file line by line and closes it when done.
File.foreach("sample-email.txt") do |line|
  case line.split(":")[0]
  when "Delivered-To"
    parsed_email[:to] = line
  when "From"
    parsed_email[:from] = line
  when "Date"
    parsed_email[:date] = line
  when "Subject"
    parsed_email[:subject] = line
  end
end
puts parsed_email
=> {:to=>"Delivered-To: user1@example.com\n", :from=>"From: John Doe <user2@example.com>\n", :date=>"Date: Tue, 12 Dec 2017 13:30:14 -0500\n", :subject=>"Subject: Testing the parser\n"}
Explanation
You need to split each line on ":" and take the first element, like this: line.split(":")[0]
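If you also want the value without the header name, one possible variation is to pass a limit to split. This is a sketch, not part of the original answer: the sample lines are modeled on the output shown above, and the keys lookup table is an assumption introduced for illustration.

```ruby
# Hypothetical header lines, modeled on the output shown above.
lines = [
  "Delivered-To: user1@example.com\n",
  "Date: Tue, 12 Dec 2017 13:30:14 -0500\n",
  "Subject: Testing the parser\n",
  "X-Other: ignored\n"
]

# Assumed mapping from header names to the hash keys used in the answer.
keys = { "Delivered-To" => :to, "From" => :from, "Date" => :date, "Subject" => :subject }

parsed = {}
lines.each do |line|
  # A limit of 2 splits only on the first colon, so "13:30:14" stays intact.
  name, value = line.split(":", 2)
  next unless value && keys.key?(name)
  parsed[keys[name]] = value.strip
end

puts parsed[:date] # prints "Tue, 12 Dec 2017 13:30:14 -0500"
```

Without the limit, a value that itself contains a colon (such as the time in the Date header) would be cut into several pieces.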
How to parse a text file containing multiple lines of data and organized by numerical values and then convert to JSON
This is a very common type of encoding called Type-Length-Value (or Tag-Length-Value), for reasons I suppose are obvious. As with many such tasks in Ruby, String#unpack is a good fit:
def decode(data)
  return {} if data.empty?
  # Two characters of key, two characters of length, then everything else.
  key, len, rest = data.unpack("a2 a2 a*")
  val = rest.slice!(0, len.to_i)
  { key => val }.merge(decode(rest))
end
p decode("HD040008000415350110XXXXXXXXXX0208XXXXXXXX0302EN0403USA0502EN0604000107014")
# => {"HD"=>"0008", "00"=>"1535", "01"=>"XXXXXXXXXX", "02"=>"XXXXXXXX", "03"=>"EN", "04"=>"USA", "05"=>"EN", "06"=>"0001", "07"=>"4"}
p decode("EM04000800030010112TME001205IQ50232Blue Point Coastal Cuisine. INC.0614565 5th Avenue0805921010909SAN DIEGO1008Downtown1102CA1203USA")
# => {"EM"=>"0008", "00"=>"001", "01"=>"TME001205IQ5", "02"=>"Blue Point Coastal Cuisine. INC.", "06"=>"565 5th Avenue", "08"=>"92101", "09"=>"SAN DIEGO", "10"=>"Downtown", "11"=>"CA", "12"=>"USA"}
If you want to read an entire file and return a JSON array of objects, something like this would suffice:
#!/usr/bin/env ruby -n
BEGIN {
  require "json"
  def decode(data)
    # ...
  end
  arr = []
}
arr << decode($_.chomp)
END { puts arr.to_json }
Then (supposing the script is called script.rb and is executable):
$ cat data.txt | ./script.rb > out.json
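For comparison, here is a sketch of the same decoding without the -n switch or recursion. The name decode_tlv and the two short sample records are assumptions made for illustration; the field layout (two characters of key, two of length, then the value) is the one described above.

```ruby
require "json"

# Iterative variant of decode: walks the string by position instead of recursing.
def decode_tlv(data)
  fields = {}
  pos = 0
  while pos < data.length
    key = data[pos, 2]           # two-character tag
    len = data[pos + 2, 2].to_i  # two-digit length
    fields[key] = data[pos + 4, len]
    pos += 4 + len
  end
  fields
end

# Hypothetical input: two short records, one per line.
input = "HD040008000415350302EN\nEM0400080003001\n"
records = input.each_line(chomp: true).map { |line| decode_tlv(line) }
puts JSON.generate(records)
```

The loop avoids building a new hash and a new string for every field, which may matter for long records, but for data of the size shown above either version should do.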
Parsing a .txt file to key/value pairs in Ruby
You can do
array = []
# Open the file in read mode. With the block version you don't need to
# close the file by hand; it is closed automatically when the block
# finishes.
File.open('path/to/file', 'r') do |file|
  # each_line without a block returns an Enumerator. Calling each_slice(2)
  # on it takes 2 lines at a time, where the first line is the
  # question and the second one is the answer.
  file.each_line.each_slice(2) do |question, answer|
    array << {'Question' => question, 'Answer' => answer}
  end
end
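Note that each line keeps its trailing newline in the version above. A small sketch of a variant that strips it, using StringIO in place of a real file; the sample questions and answers are made up for illustration.

```ruby
require "stringio"

# Hypothetical file contents; StringIO stands in for File.open here.
io = StringIO.new("What is Ruby?\nA language.\nWhat is IRB?\nA REPL.\n")

# chomp: true drops the line break from every line before pairing them up.
pairs = io.each_line(chomp: true).each_slice(2).map do |question, answer|
  { "Question" => question, "Answer" => answer }
end

puts pairs.length            # prints 2
puts pairs.first["Answer"]   # prints "A language."
```

If the input has an odd number of lines, the last pair will have a nil answer, so real data may need a check for that.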