Ruby Find String in File and Print Result

Find and print lines in a file exactly matching string or regexp (Ruby)

This works for me using the english.0 file on that page (sorry, I couldn't find the specific file you mentioned):

a = %w[b a h s v i e y k s a l d n]
dict = {}
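a.permutation(5).each do |p|
  dict[p.join('')] = true
end

# chomp! returns nil when there is nothing to strip (for example, a last line
# with no trailing newline), so chaining bang methods can raise NoMethodError.
File.foreach('english.0') do |line|
  word = line.chomp.downcase
  puts word if dict[word]
end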

The structure should be pretty clear: I build the dictionary of permutations up front in one giant hash (you may need to revisit this depending on input sizes, but memory is cheap these days), and then I use the fact that the input is "one word per line" to simply key into that hash.

Also note that my version reads through the file only once. Yours scans the file once per permutation, and there are 240,240 permutations of 14 letters taken 5 at a time.
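
If the permutation hash ever becomes too large, one alternative (not part of the answer above) is to skip generating permutations and compare each word's letter counts against the available letters. This is a minimal sketch, assuming Ruby 2.7+ for `tally`; the helper name `spellable?` and the in-memory `words` list standing in for the file are illustrative only.

```ruby
# Hypothetical helper: true if `word` can be spelled from the available
# letters without using any letter more often than it appears.
def spellable?(word, available)
  word.each_char.tally.all? { |ch, n| available.fetch(ch, 0) >= n }
end

letters   = %w[b a h s v i e y k s a l d n]
available = letters.tally # letter => how many times it may be used

# Stand-in for the lines of the word list; in practice use File.foreach.
words = %w[Shake blend zebra SALAD hi]

matches = words.map(&:downcase).select do |w|
  w.length == 5 && spellable?(w, available)
end
puts matches
```

This accepts the same words as the permutation lookup, because a 5-letter permutation of the array is exactly a 5-letter word that uses each letter no more often than it is listed.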

How to search strings of one file in another file and print line number of the match in ruby?

I assume each line in file 1 appears in at most one line in file 2 and each line of file 2 contains no more than one language in file 1, which is consistent with the example given in the question.

Let's first construct the files. To make life more interesting, I've modified the contents of both files given in the question.

file1 =<<-END
Ruby
C
Visual Basic
C++
R
Objective-C++
Basic
HTML
END

FName1 = 'file1'
File.write(FName1, file1)
#=> 51

file2 =<<-END
5. ab cde fg Java hij kl
2. ab PHP dddf llf
4. cde fg z o Objective-C++ oode
8. a12b cde JavaScript kdk
6. ab99r cde Visual Basic llso dkd
1. lkd dsk Ruby kksdk
3. Python dsdls
7. kdjd C jdjd
9. CSS dkdsk
10. blah C++ blah
7. kkd Basic jjs
3. rooor R kdk
END

FName2 = 'file2'
File.write(FName2, file2)
#=> 256

First read the lines of FName1 into an array.

languages = File.readlines(FName1, chomp:true)
#=> ["Ruby", "C", "Visual Basic", "C++",
# "R", "Objective-C++", "Basic", "HTML"]

Now, for convenience, order the elements of languages by decreasing length.

sorted_languages = languages.sort_by(&:length).reverse
#=> ["Objective-C++", "Visual Basic", "Basic",
# "Ruby", "HTML", "C++", "C", "R"]

I've sorted the elements of languages by decreasing word length so that an attempt to match a line of FName2 with 'Objective-C++' will be made before an attempt is made to match 'C++', and 'C++' will be considered before 'C'. Similarly, 'Visual Basic' will be considered as a match before 'Basic' is considered.
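
To see why the ordering matters, here is a small sketch using plain substring matching on one of the lines from file 2. The variable names are illustrative only.

```ruby
languages = ["C", "C++", "Objective-C++"]
line = "4. cde fg z o Objective-C++ oode"

# In the original order, the shortest name matches first.
unsorted_hit = languages.find { |lang| line.include?(lang) }

# Longest first, the most specific name wins.
sorted_hit = languages.sort_by { |lang| -lang.length }
                      .find { |lang| line.include?(lang) }

puts unsorted_hit # C
puts sorted_hit   # Objective-C++
```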

Next, create a hash whose keys are those lines in FName1 that appear in a line of FName2 and whose values are hashes identifying the line number and line in FName2 for the given key.

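language_to_file2 = File.foreach(FName2, chomp: true).
  with_index(1).
  with_object({}) do |(line, n), h|
    # Match whole whitespace-delimited names only, so that 'C' does not
    # match inside 'CSS' (a plain include? would overwrite the entry for 'C').
    language = sorted_languages.find do |lang|
      line.match?(/(?<!\S)#{Regexp.escape(lang)}(?!\S)/)
    end
    h[language] = { line: line, nbr: n } unless language.nil?
  end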
#=> {"Objective-C++"=>{:line=>"4. cde fg z o Objective-C++ oode", :nbr=>3},
# "Visual Basic" =>{:line=>"6. ab99r cde Visual Basic llso dkd", :nbr=>5},
# "Ruby" =>{:line=>"1. lkd dsk Ruby kksdk", :nbr=>6},
# "C" =>{:line=>"7. kdjd C jdjd", :nbr=>8},
# "C++" =>{:line=>"10. blah C++ blah", :nbr=>10},
# "Basic" =>{:line=>"7. kkd Basic jjs", :nbr=>11},
# "R" =>{:line=>"3. rooor R kdk", :nbr=>12}}

We may now display the desired result.

languages.each do |language|
  print "#{language}|"
  if language_to_file2.key?(language)
    h = language_to_file2[language]
    puts "%d|%s" % [h[:nbr], h[:line]]
  else
    puts "Not found"
  end
end
Ruby|6|1. lkd dsk Ruby kksdk
C|8|7. kdjd C jdjd
Visual Basic|5|6. ab99r cde Visual Basic llso dkd
C++|10|10. blah C++ blah
R|12|3. rooor R kdk
Objective-C++|3|4. cde fg z o Objective-C++ oode
Basic|11|7. kkd Basic jjs
HTML|Not found

How to search file text for a pattern and replace it with a given value

Disclaimer: This approach is a naive illustration of Ruby's capabilities, and not a production-grade solution for replacing strings in files. It's prone to various failure scenarios, such as data loss in case of a crash, interrupt, or disk being full. This code is not fit for anything beyond a quick one-off script where all the data is backed up. For that reason, do NOT copy this code into your programs.

Here's a quick short way to do it.

file_names = ['foo.txt', 'bar.txt']

file_names.each do |file_name|
  text = File.read(file_name)
  new_contents = text.gsub(/search_regexp/, "replacement string")

  # To merely print the contents of the file, use:
  puts new_contents

  # To write changes to the file, use:
  File.open(file_name, "w") { |file| file.puts new_contents }
end
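
One common way to reduce the risk of a half-written file is to write the new contents to a temporary file in the same directory and then rename it over the original. The sketch below illustrates that idea only; the helper name `replace_in_file` is hypothetical, it does not preserve the original file's permissions, and it is still not a production-grade solution.

```ruby
require "tempfile"
require "tmpdir"

# Hypothetical helper: rewrite `path` via a temporary file plus rename, so the
# original is left untouched if writing the new contents fails part-way.
def replace_in_file(path, pattern, replacement)
  new_contents = File.read(path).gsub(pattern, replacement)
  # Create the temp file next to the target so the rename stays on one filesystem.
  tmp = Tempfile.create(File.basename(path), File.dirname(path))
  begin
    tmp.write(new_contents)
    tmp.close
    File.rename(tmp.path, path)
  rescue StandardError
    tmp.close unless tmp.closed?
    File.delete(tmp.path) if File.exist?(tmp.path)
    raise
  end
end

# Usage on a throwaway file in a temporary directory.
result = Dir.mktmpdir do |dir|
  path = File.join(dir, "foo.txt")
  File.write(path, "hello world\n")
  replace_in_file(path, /world/, "Ruby")
  File.read(path)
end
puts result
```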

Check if file contains string

I would use:

if File.readlines("testfile.txt").grep(/monitor/).any?

or

if File.readlines("testfile.txt").any?{ |l| l['monitor'] }

Using readlines has scalability issues, though, because it reads the entire file into an array. Using foreach instead accomplishes the same thing while reading the file one line at a time:

if File.foreach("testfile.txt").grep(/monitor/).any?

or

if File.foreach("testfile.txt").any?{ |l| l['monitor'] }

See "Why is "slurping" a file not a good practice?" for more information about the scalability issues.
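
As a self-contained sketch, the check can be wrapped in a small method. The helper name `file_contains?` and the temporary file are assumptions for illustration. Note that `any?` with a block stops at the first matching line, whereas `grep(...).any?` collects every matching line before answering.

```ruby
require "tempfile"

# Hypothetical helper: stops reading at the first line that contains `needle`.
def file_contains?(path, needle)
  File.foreach(path).any? { |line| line.include?(needle) }
end

found, missing = Tempfile.create("testfile") do |f|
  f.puts "first line", "the monitor is on", "last line"
  f.flush
  [file_contains?(f.path, "monitor"), file_contains?(f.path, "keyboard")]
end

puts found   # true
puts missing # false
```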

Ruby, How to match strings from one file with another file and print the matched strings

Here's one way:

words = File.read("file1.txt").strip.split(/,/)
words_regex = Regexp.union(words)

# File.foreach closes the file when it is done, unlike a bare File.open.
File.foreach("file2.txt") do |line|
  puts line if line =~ words_regex
end

Output:

ruby-443-543-fx
amazing-122-454-nx
awesome-522-65-nx
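
The same idea can be tried without any files. In this sketch the strings standing in for file1.txt and file2.txt are made up for illustration; it also shows that `Regexp.union` escapes each word, so a name such as "C++" is matched literally.

```ruby
# Hypothetical stand-ins for the contents of file1.txt and file2.txt.
file1_text  = "ruby,amazing,C++\n"
file2_lines = ["ruby-443-543-fx", "boring-111-222-zz", "C++-522-65-nx", "C-1-2-xx"]

words       = file1_text.strip.split(",")
words_regex = Regexp.union(words) # each word is escaped before being joined with |

matched = file2_lines.grep(words_regex)
puts matched
```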

How to Find a String in a Directory of Files Using grep and glob?

Your code is almost right; it is only missing the .each call.

I tested the following and it works fine.

Dir.glob('spec/features/*.rb').each do |f|
  puts 'haha' if File.readlines(f).any? { |line| line.include?('capybara') }
end
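
A variation that collects the names of the matching files, rather than printing a fixed message, might look like the sketch below. The file names and contents are invented for illustration, and everything is created in a temporary directory.

```ruby
require "tmpdir"

matching = Dir.mktmpdir do |dir|
  # Hypothetical spec files used only for this example.
  File.write(File.join(dir, "login_spec.rb"), "require 'capybara'\n")
  File.write(File.join(dir, "math_spec.rb"), "puts 1 + 1\n")

  Dir.glob(File.join(dir, "*.rb")).select do |f|
    File.foreach(f).any? { |line| line.include?("capybara") }
  end.map { |f| File.basename(f) }
end

puts matching
```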

Print the first word of the line of matched string in Ruby

Here is an example from a script I had that does what you are looking for. It's fairly straightforward if you use the each_line method of a file object.

#!/usr/bin/env ruby
regex_to_find = Regexp.new Regexp.escape(ARGV[0])
files = Dir.glob ARGV[1]
files.each do |f|
  # The block form of File.open closes each file after it has been read.
  File.open(f) do |current_file|
    current_file.each_line do |l|
      if l =~ regex_to_find
        puts "#{f} #{current_file.lineno}: first word = #{l.split.first}, full line: #{l}"
      end
    end
  end
end

If you run this script in a directory containing a file with the data you show above, you get the following output, which I think is what you are looking for.

$ ./q43950329.rb  'class\234ha' "*"
q43950329_data 4: first word = id, full line: id = class\234ha, class\poi23, class\opiuj, cap\7y6t5
q43950329_data 5: first word = dept, full line: dept = sub\6985de, ret\oiu87, class\234ha

Note that the script above is saved in a file called q43950329.rb, and the following data file, called q43950329_data, exists in the current directory:

CDA created on September 20th 1999
Owner: Edward Jenner
Access IDs,
id = class\234ha, class\poi23, class\opiuj, cap\7y6t5
dept = sub\6985de, ret\oiu87, class\234ha
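
The line-number and first-word logic can also be tried without touching the filesystem. This sketch feeds the same data through a StringIO, which is an assumption made for illustration; the script above reads real files.

```ruby
require "stringio"

# Single-quoted heredoc so the backslashes in the data stay literal.
data = StringIO.new(<<~'DATA')
  CDA created on September 20th 1999
  Owner: Edward Jenner
  Access IDs,
  id = class\234ha, class\poi23, class\opiuj, cap\7y6t5
  dept = sub\6985de, ret\oiu87, class\234ha
DATA

regex_to_find = Regexp.new(Regexp.escape('class\234ha'))

hits = []
data.each_line do |l|
  # lineno is the number of the line that each_line has just read.
  hits << [data.lineno, l.split.first] if l =~ regex_to_find
end

hits.each { |lineno, word| puts "#{lineno}: first word = #{word}" }
```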

