Equivalent to Python's findall() Method in Ruby

Equivalent to Python’s findall() method in Ruby?

f = File.new("tracklist.txt", "r")
s = f.read
s.scan(/mmc.+?mp3/) do |track|
  puts track
end

What this code does is open the file for reading and read its contents as a string into the variable s. The string is then scanned for the regular expression /mmc.+?mp3/. String#scan returns an array of all matches or, when given a block as it is here, yields each match to the block, which prints it.
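For comparison, the Python side of the question looks roughly like this. It is a minimal sketch: the sample text is made up so that the block runs without tracklist.txt on disk.

```python
import re

# Made-up sample standing in for the contents of tracklist.txt.
text = "01 mmc_intro.mp3\n02 mmc_outro.mp3\n"

# re.findall returns every non-overlapping match as a list of strings,
# much like String#scan called without a block.
tracks = re.findall(r"mmc.+?mp3", text)

for track in tracks:
    print(track)
```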

Python equivalent of Ruby's .find

Below is Python code for your Ruby logic.

CF={"metre":{"kilometre":0.001, "metre":1.0, "centimetre":100.0}, "litre":{"litre":1.0, "millilitre":1000.0, "imperial_pint":1.75975}}

def common(fr, to):
    # Return the base unit whose conversion table contains both units.
    for key, value in CF.items():
        if (fr in value) and (to in value):
            return key

print(common('metre', 'centimetre'))
metre
print(common('metre', 'centimdetre'))  # misspelled unit, so nothing matches
None
******************

single line function
com = lambda x, y: [key for key, value in CF.items() if (x in value) and (y in value)]
print(com('metre', 'centimetre'))
['metre']

How to access the various occurrences of the same match group in Ruby regular expressions?

To expand on my comment and respond to your question:

If you want to store the values in an array, modify the block and collect instead of iterate:

> arr = xml.grep(/<DATA size="(\d+)"/).collect { |d| d.match /\d+/ }
> arr.each { |a| puts "==> #{a}" }
==> 916
==> 229885

The |d| is normal Ruby block parameter syntax; each d is the matching string, from which the number is extracted. It's not the cleanest Ruby, although it's functional.

I still recommend using a parser; note that the rexml version would be this (more or less):

require 'rexml/document'
include REXML
doc = Document.new xml
arr = doc.elements.collect("//DATA") { |d| d.attributes["size"] }
arr.each { |a| puts "==> #{a}" }

Once your "XML" is converted to actual XML you can get even more useful data:

doc = Document.new xml
arr = doc.elements.collect("//file") do |f|
  name = f.elements["FILENAME"].attributes["path"]
  size = f.elements["DATA"].attributes["size"]
  [name, size]
end

arr.each { |a| puts "#{a[0]}\t#{a[1]}" }

~/Users/1.txt 916
~/Users/2.txt 229885

How to check whether a string contains a substring in Ruby

You can use the include? method:

my_string = "abcdefg"
if my_string.include? "cde"
  puts "String includes 'cde'"
end
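
Since this page compares the two languages, the Python counterpart of include? is the in operator. A minimal sketch using the same sample string:

```python
my_string = "abcdefg"

# The `in` operator tests for a substring, like String#include? in Ruby.
contains_cde = "cde" in my_string

if contains_cde:
    print("String includes 'cde'")
```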

How to use symbolic group name using re.findall()

You can't do that with .findall(). However, you can achieve the same effect with .finditer() and some list comprehension magic:

import re
print([m.groupdict() for m in re.finditer(r'toto=(?P<toto>\d+),\sbip=(?P<bip>\w+)', my_str)])

This prints:

[{'toto': '1', 'bip': 'xyz'}, {'toto': '15', 'bip': 'abu'}]

So we loop over each match yielded by .finditer() and take its .groupdict() result.
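
A self-contained version of the same idea, as a sketch. The value of my_str is an assumption, chosen only so that it produces the output quoted above.

```python
import re

# Hypothetical input; the original question's string is not shown here.
my_str = "toto=1, bip=xyz toto=15, bip=abu"

pattern = re.compile(r"toto=(?P<toto>\d+),\sbip=(?P<bip>\w+)")

# finditer yields one Match object per hit; groupdict() maps each
# named group to the text it captured.
matches = [m.groupdict() for m in pattern.finditer(my_str)]
print(matches)
```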

Create array of substrings with known start and end point from a large string

After the update, I think this is what you are looking for:

def findProteins(dnaString)
  index = 0
  dnaSubStrings = []
  while index < dnaString.length
    dnaSubStrings.push(dnaString[/ATG(?:.{3})*(?:TGA|TAG|TAA)/, index])
    index += 3
  end
  return dnaSubStrings
end

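I've tried it out on this online compiler, but it only found one big protein. Is that right? I think the logic is more or less like this, or at least close. Two things to keep in mind: String#[] with a regexp and an integer treats the integer as a capture-group number rather than a start offset, and (?:.{3})* is greedy, so it runs on to the last stop codon it can reach.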
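
One possible way to get several shorter matches instead of a single long one is a lazy quantifier, sketched here in Python. This is not the code from the answer above: find_proteins, LAZY_ORF, GREEDY_ORF and the sample sequence are made up for illustration, and the sketch only reports non-overlapping matches found left to right.

```python
import re

# Lazy version: stop at the first in-frame stop codon after ATG.
LAZY_ORF = re.compile(r"ATG(?:.{3})*?(?:TGA|TAG|TAA)")
# Greedy version, as in the pattern above: run to the last reachable stop codon.
GREEDY_ORF = re.compile(r"ATG(?:.{3})*(?:TGA|TAG|TAA)")

def find_proteins(dna_string):
    """Return all non-overlapping ATG...stop substrings, shortest first hit."""
    return LAZY_ORF.findall(dna_string)

# Made-up sequence with two back-to-back reading frames.
dna = "ATGAAATAGATGCCCTGA"

lazy_result = find_proteins(dna)
greedy_result = GREEDY_ORF.findall(dna)
print(lazy_result)
print(greedy_result)
```

With the greedy pattern the whole sample comes back as one match, which is consistent with the "one big protein" observation.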

Find all associated objects by specific condition

class QuestionGroup < ActiveRecord::Base
  has_many :questions

  def self.without_answers(user_id)
    joins("inner join questions on question_groups.id = questions.question_group_id
           inner join question_answers
             on question_answers.question_id = questions.id
           inner join question_users_answers
             on question_answers.question_users_answers_id = question_users_answers.id").where("question_users_answers.user_id" => user_id).select { |qq| ... }
  end
end

You can change some of the inner joins to left outer joins to pick up records where the table you are joining to has no match, for instance where there is no answer. For those rows, every field of the joined table is NULL. Adding a where clause that checks the joined table's id is null then filters the result down to just the questions with no answers.

Keep in mind that this is just an alternative technique. You could solve the problem programmatically, simply with:

class QuestionGroup
  def self.question_groups_without_answers(user_id)
    select {|qq| qq.question_users_answers.where(:user_id=>user_id).empty?}.map{ |qq| qq.question_group }
  end
end

An advantage of doing the joins is that the database does all the work, and you don't send several SQL queries to the database, so it can be much faster.
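
The left outer join plus "is null" technique can be tried outside Rails. The sketch below uses Python's sqlite3 module with a deliberately simplified, hypothetical two-table schema (questions and answers); it is not the schema from the question, and unanswered_question_ids is a made-up helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE answers (id INTEGER PRIMARY KEY, question_id INTEGER, user_id INTEGER);
    INSERT INTO questions VALUES (1, 'q1'), (2, 'q2'), (3, 'q3');
    INSERT INTO answers VALUES (10, 1, 7), (11, 3, 8);
""")

def unanswered_question_ids(conn, user_id):
    """Ids of questions that the given user has not answered."""
    rows = conn.execute(
        """
        SELECT q.id
        FROM questions AS q
        LEFT OUTER JOIN answers AS a
            ON a.question_id = q.id AND a.user_id = ?
        WHERE a.id IS NULL      -- no matching answer row was found
        ORDER BY q.id
        """,
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(unanswered_question_ids(conn, 7))
```

The database does the filtering in a single query, which is the advantage described above.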

Learning Python from Ruby; Differences and Similarities

Here are some key differences to me:

  1. Ruby has blocks; Python does not.

  2. Python has functions; Ruby does not. In Python, you can take any function or method and pass it to another function. In Ruby, everything is a method, and methods can't be passed directly. Instead, you have to wrap them in Procs (or obtain a Method object with method(:name)) to pass them.

  3. Ruby and Python both support closures, but in different ways. In Python, you can define a function inside another function. The inner function can read variables from the outer function, but it can only rebind them if it declares them nonlocal (Python 3); without that declaration, an assignment creates a new local variable. In Ruby, you define closures using blocks. The closures have full read and write access to variables from the outer scope.

  4. Python has list comprehensions, which are pretty expressive. For example, if you have a list of numbers, you can write

    [x*x for x in values if x > 15]

    to get a new list of the squares of all values greater than 15. In Ruby, you'd have to write the following:

    values.select {|v| v > 15}.map {|v| v * v}

    The Ruby code doesn't feel as compact. It's also not as efficient since it first converts the values array into a shorter intermediate array containing the values greater than 15. Then, it takes the intermediate array and generates a final array containing the squares of the intermediates. The intermediate array is then thrown out. So, Ruby ends up with 3 arrays in memory during the computation; Python only needs the input list and the resulting list.

    Python also supplies similar dict and set comprehensions.

  5. Python supports tuples; Ruby doesn't. In Ruby, you have to use arrays to simulate tuples.

  6. Ruby supports case/when statements; Python had no equivalent until the match statement was added in Python 3.10.

  7. Ruby supports the C-style expr ? val1 : val2 ternary operator; Python spells it differently, as the conditional expression val1 if expr else val2.

  8. Ruby supports only single inheritance. If you need to mimic multiple inheritance, you can define modules and use mix-ins to pull the module methods into classes. Python supports multiple inheritance rather than module mix-ins.

  9. Python lambdas are limited to a single expression. Ruby blocks, which are kind of/sort of lambda functions, can be arbitrarily big. Because of this, Ruby code is typically written in a more functional style than Python code. For example, to loop over a list in Ruby, you typically do

    collection.each do |value|
      ...
    end

    The block works very much like a function being passed to collection.each. If you were to do the same thing in Python, you'd have to define a named inner function and then pass that to the collection each method (if list supported this method):

    def some_operation(value):
        ...

    collection.each(some_operation)

    That doesn't flow very nicely. So, typically the following non-functional approach would be used in Python:

    for value in collection:
        ...

  10. Using resources in a safe way is quite different between the two languages. Here, the problem is that you want to allocate some resource (open a file, obtain a database cursor, etc), perform some arbitrary operation on it, and then close it in a safe manner even if an exception occurs.

    In Ruby, because blocks are so easy to use (see #9), you would typically code this pattern as a method that takes a block for the arbitrary operation to perform on the resource.

    In Python, passing in a function for the arbitrary action is a little clunkier since you have to write a named, inner function (see #9). Instead, Python uses a with statement for safe resource handling. See How do I correctly clean up a Python object? for more details.
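
Several of the Python points in this list (closures, comprehensions, conditional expressions, passing functions, and the with statement) are easier to judge with code in hand. The following is a minimal sketch for Python 3; every name in it (make_counter, apply_to_each, demo.txt and so on) is made up for illustration.

```python
import os
import tempfile

def make_counter():
    """Closure: the inner function rebinds `count` via nonlocal."""
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
first, second = counter(), counter()

values = [3, 16, 20]
# List comprehension: filter and transform in one pass.
squares = [x * x for x in values if x > 15]
# Dict comprehension with the same filter.
squares_by_value = {x: x * x for x in values if x > 15}

# Conditional expression, Python's counterpart of expr ? a : b.
label = "big" if values[-1] > 15 else "small"

def apply_to_each(collection, operation):
    """Functions are ordinary values and can be passed around."""
    return [operation(v) for v in collection]

doubled = apply_to_each(values, lambda v: v * 2)

# with statement: each file is closed even if the body raises.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w") as f:
        f.write("hello")
    with open(path) as f:
        content = f.read()
```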

Python Regex Capture Only Certain Text

re.findall(r'\{(.+?)\}', request.params['upsell'])

This will return a list where each entry is the contents of a different group of curly braces. Note that this will not work for nested braces.

The ? after the .+ will make it a lazy match (as opposed to greedy). This means that the match will stop at the first "}", instead of continuing to match as many characters as possible and ending on the last closing brace.

re.findall() searches the string for all non-overlapping matches and, because the pattern contains one capture group, returns the contents of that group for each match. Alternatively, you could use re.finditer(), which iterates over Match objects, but then you would need match.group(1) to get only what is inside the braces. This is also what you would need to change in your example: match.group() returns the entire match, not the captured group; to get the group, pass the number of the group you want.
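
A small sketch of the difference between findall(), group() and group(1). The sample string is made up and stands in for request.params['upsell'].

```python
import re

# Hypothetical value standing in for request.params['upsell'].
upsell = "offer {gold plan} or {silver plan}"

pattern = re.compile(r"\{(.+?)\}")

# With one capture group, findall returns the group contents.
inner = pattern.findall(upsell)

# finditer yields Match objects; group() is the whole match,
# group(1) is only the text inside the braces.
whole = [m.group() for m in pattern.finditer(upsell)]
captured = [m.group(1) for m in pattern.finditer(upsell)]

print(inner, whole, captured)
```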

Why doesn't python's re.findall return all the found substrings in my example?

help(re.findall)

Help on function findall in module re:

findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.

    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result.

Since the two results overlap (both have the '2' in them), only the first one will be returned.

If instead you have t = '1 2 3 4', the result will be ['1 2', '3 4'].
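
The question's pattern is not reproduced on this page, so the sketch below assumes it was something like r'\d \d'. It shows the non-overlapping behaviour and one common workaround, a capture group inside a lookahead, which lets findall() report overlapping matches.

```python
import re

# Assumed pattern: a digit, a space, a digit.
pair = r"\d \d"

# '1 2' consumes the '2', so '2 3' cannot also be reported.
non_overlapping = re.findall(pair, "1 2 3")
even_count = re.findall(pair, "1 2 3 4")

# A lookahead consumes no characters, so every start position is tried
# and the capture group holds the overlapping text.
overlapping = re.findall(r"(?=(\d \d))", "1 2 3")

print(non_overlapping, even_count, overlapping)
```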


