Is there something like Python generators in Ruby?
Ruby's yield keyword is something very different from the Python keyword with the same name, so don't be confused by it. Ruby's yield keyword is syntactic sugar for calling a block associated with a method.
The closest equivalent to a Python generator is Ruby's Enumerator class. For example, the equivalent of the Python:
def eternal_sequence():
    i = 0
    while True:
        yield i
        i += 1
is this:
def eternal_sequence
  Enumerator.new do |enum|
    i = 0
    while true
      enum.yield i # <- Notice that this is the yield method of the enumerator, not the yield keyword
      i += 1
    end
  end
end
You can also create Enumerators for existing enumeration methods with enum_for. For example, ('a'..'z').enum_for(:each_with_index) gives you an enumerator of the lowercase letters along with their zero-based position in the alphabet. Since Ruby 1.9, the standard Enumerable methods like each_with_index return an enumerator when called without a block, so you can just write ('a'..'z').each_with_index to get the enumerator.
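As a rough sketch of how such enumerators can be consumed, the snippet below restates eternal_sequence (using Kernel#loop instead of while true) and pulls values from it in a few ways. The variable names are illustrative and not part of the original answer.

```ruby
# Infinite generator, same idea as eternal_sequence above.
def eternal_sequence
  Enumerator.new do |enum|
    i = 0
    loop do
      enum.yield i
      i += 1
    end
  end
end

# Enumerable#take stops iterating after five elements.
first_five = eternal_sequence.take(5)

# Enumerator::Lazy lets you chain filters over an infinite sequence.
first_evens = eternal_sequence.lazy.select(&:even?).first(3)

# each_with_index without a block returns an Enumerator;
# enum_for(:each_with_index) builds the same thing explicitly.
first_pair = ('a'..'z').each_with_index.next
same_pair  = ('a'..'z').enum_for(:each_with_index).next
```

Here take and first are what keep the infinite sequence from running forever; calling to_a on it would never return.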
Python yield vs Ruby yield
In Ruby, yield is a shortcut that is used to call an anonymous function. Ruby has a special syntax for passing an anonymous function to a method; the syntax is known as a block. Because the function has no name, you use the name yield to call the function:
def do_stuff(val)
  puts "Started executing do_stuff"
  yield(val+3)
  yield(val+4)
  puts "Finished executing do_stuff"
end
do_stuff(10) {|x| puts x+3} #<= This is a block, which is an anonymous function
                            #that is passed as an additional argument to the
                            #method do_stuff
--output:--
Started executing do_stuff
16
17
Finished executing do_stuff
In Python, when you see yield inside a function definition, that means that the function is a generator. A generator is a special type of function that can be stopped mid-execution and resumed later. Here's an example:
def do_stuff(val):
    print("Started execution of do_stuff()")
    yield val + 3
    print("Line after 'yield val + 3'")
    yield val + 4
    print("Line after 'yield val + 4'")
    print("Finished executing do_stuff()")
my_gen = do_stuff(10)
val = next(my_gen)
print("--received {} from generator".format(val))
output:
Started execution of do_stuff()
--received 13 from generator
More code:
val = next(my_gen)
print("--received {} from generator".format(val))
output:
Line after 'yield val + 3'
--received 14 from generator
From the output, you can see that yield causes a result to be returned; then execution is immediately halted. When you call next() again on the generator, execution continues until the next yield statement is encountered, which returns a value, then execution halts again.
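For comparison, a Ruby Enumerator driven with next shows a similar pause-and-resume pattern. This is only a sketch: the log array and variable names are made up for illustration, and the values mirror the Python example above.

```ruby
log = []

gen = Enumerator.new do |y|
  log << "started"
  y << 13                     # suspends here until the next call to gen.next
  log << "after first value"
  y << 14
  log << "finished"
end

first = gen.next
log_after_first = log.dup     # only the code before the first value has run

second = gen.next
log_after_second = log.dup    # execution resumed and stopped at the second value

# A further call runs the rest of the block and then raises StopIteration.
exhausted = begin
  gen.next
  false
rescue StopIteration
  true
end
```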
Python yield (migrating from Ruby): How can I write a function without arguments and only with yield to do prints?
yield in Ruby and yield in Python are two very different things.
In Ruby, yield runs a block passed as a parameter to the function.
Ruby:
def three
  yield
  yield
  yield
end
three { puts 'hello '} # runs block (prints "hello") three times
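To make the "block passed as a parameter" point concrete, here is a small sketch in which the same method is written once with yield and once with an explicit &block parameter. The method names three_implicit and three_explicit are hypothetical, chosen only for this example.

```ruby
# Uses the yield keyword to call the implicit block.
def three_implicit
  yield
  yield
  yield
end

# Captures the block as a Proc and calls it explicitly.
def three_explicit(&block)
  3.times { block.call }
end

implicit_calls = 0
explicit_calls = 0
three_implicit { implicit_calls += 1 }
three_explicit { explicit_calls += 1 }
```

Both versions run the block three times; the explicit form is closer in shape to the Python function that takes func as an argument.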
In Python, yield produces a value from a generator (which is a function that uses yield) and suspends execution of the function. So it's something completely different. More likely, you want to pass a function as a parameter to the function in Python.
Python:
def three(func):
    func()
    func()
    func()
three(lambda: print('hello')) # runs function (prints "hello") three times
Python Generators
The code below (code you've provided) is a generator which yields None three times:
def three():
    yield
    yield
    yield
g = three() #=> <generator object three at 0x7fa3e31cb0a0>
next(g) #=> None
next(g) #=> None
next(g) #=> None
next(g) #=> raises StopIteration
The only way I can imagine using it to print "Hello" three times is as an iterator:
for _ in three():
    print('Hello')
Ruby Analogy
You can do a similar thing in Ruby using Enumerator.new:
def three
  Enumerator.new do |e|
    e.yield # or e << nil
    e.yield # or e << nil
    e.yield # or e << nil
  end
end
g = three
g.next #=> nil
g.next #=> nil
g.next #=> nil
g.next #=> raises StopIteration
three.each do
  puts 'Hello'
end
Does Ruby have something like Python's list comprehensions?
The common way in Ruby is to properly combine Enumerable and Array methods to achieve the same:
digits.product(chars).select{ |d, ch| d >= 2 && ch == 'a' }.map(&:join)
This is only 4 or so characters longer than the list comprehension and just as expressive (IMHO of course, but since list comprehensions are just a special application of the list monad, one could argue that it's probably possible to adequately rebuild that using Ruby's collection methods), while not needing any special syntax.
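The answer does not show what digits and chars contain, so the following runnable sketch assumes small sample arrays purely for illustration:

```ruby
# Sample inputs (assumed; the original does not define them).
digits = [1, 2, 3]
chars  = %w[a b]

# product builds every [digit, char] pair, select filters the pairs,
# and map(&:join) glues each surviving pair into a string.
result = digits.product(chars)
               .select { |d, ch| d >= 2 && ch == 'a' }
               .map(&:join)
```

With these inputs, result is ["2a", "3a"].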
Ruby equivalent of Python's dict comprehension
Let's take a couple of steps back and ignore the specifics of Ruby and Python for now.
Mathematical set-builder notation
The concept of comprehension originally comes from mathematical set-builder notation, e.g. something like this: E = { n ∈ ℕ | 2∣n } which defines E to be the set of all even natural numbers, as does E = { 2n | n ∈ ℕ }.
List comprehensions in Programming Languages
This set-builder notation inspired similar constructs in many programming languages all the way back to 1969, although it wasn't until the 1970s that Phil Wadler coined the term comprehensions for these. List comprehensions ended up being implemented in Miranda in the early 1980s, which was a hugely influential programming language.
However, it is important to understand that these comprehensions do not add any new semantic features to the world of programming languages. In general, there is no program you can write with a comprehension that you cannot also write without. Comprehensions provide a very convenient syntax for expressing these kinds of transformations, but they don't do anything that couldn't also be achieved with the standard recursion patterns like fold, map, scan, unfold, and friends.
So, let's first look at how the various features of Python's comprehensions compare to the standard recursion patterns, and then see how those recursion patterns are available in Ruby.
Python
[Note: I will use Python list comprehension syntax here, but it doesn't really matter since list, set, dict comprehensions and generator expressions all work the same. I will also use the convention from functional programming to use single-letter variables for collection elements and the plural for collections, i.e. x for an element and xs for "a collection of x-es".]
Transforming each element the same way
[f(x) for x in xs]
This transforms each element of the original collection using a transformation function into a new element of a new collection. This new collection has the same number of elements as the original collection and there is a 1:1 correspondence between the elements of the original collection and the elements of the new collection.
One could say that each element of the original collection is mapped to an element of the new collection. Hence, this is typically called map in many programming languages, and in fact, it is called that in Python as well:
map(f, xs)
The same, but nested
Python allows you to have multiple for / in clauses in a single comprehension. This is more or less equivalent to having nested mappings which then get flattened into a single collection:
[f(x, y) for x in xs for y in ys]
# or
[f(y) for ys in xs for y in ys]
This combination of mapping and then flattening the collection is commonly known as flatMap (when applied to collections) or bind (when applied to Monads).
Filtering
The last operation that Python comprehensions support is filtering:
[x for x in xs if p(x)]
This will filter the collection xs into a collection which contains the subset of the original elements that satisfy the predicate p. This operation is commonly known as filter.
Combine as you like
Obviously, you can combine all of these, i.e. you can have a comprehension with multiple nested generators that filter out some elements and then transform them.
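As a hedged preview of how these comprehension forms might look with Ruby's collection methods, the sketch below pairs each Python form (in a comment) with one possible Ruby counterpart. The sample data and the concrete transformations are assumptions made for the example.

```ruby
xs = [1, 2, 3]
ys = [10, 20]

# [x * x for x in xs]
mapped = xs.map { |x| x * x }

# [x + y for x in xs for y in ys]
nested = xs.flat_map { |x| ys.map { |y| x + y } }

# [x for x in xs if x % 2 == 1]
filtered = xs.select(&:odd?)

# [x + y for x in xs if x % 2 == 1 for y in ys]
combined = xs.select(&:odd?).flat_map { |x| ys.map { |y| x + y } }
```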
Ruby
Ruby also provides all of the recursion patterns (or collection operations) mentioned above, and many more. In Ruby, an object that can be iterated over is called an enumerable, and the Enumerable mixin in the core library provides a lot of useful and powerful collection operations.
Ruby was originally heavily inspired by Smalltalk, and some of the older names of Ruby's original collection operations still go back to this Smalltalk heritage. In the Smalltalk collections framework, there is an in-joke about all the collections methods rhyming with each other; thus, the fundamental collections methods in Smalltalk are called [listed here with their more standard equivalents from functional programming]:
collect:, which "collects" all elements returned from a block into a new collection, i.e. this is the equivalent to map.
select:, which "selects" all elements that satisfy a block, i.e. this is the equivalent to filter.
reject:, which "rejects" all elements that satisfy a block, i.e. this is the opposite of select: and thus equivalent to what is sometimes called filterNot.
detect:, which "detects" whether an element which satisfies a block is inside the collection, i.e. this is the equivalent to contains. Except, it actually returns the element as well, so it is more like findFirst.
inject:into: … where the nice naming schema breaks down somewhat …: it does "inject" a starting value "into" a block but that's a somewhat strained connection to what it actually does. This is the equivalent to fold.
So, Ruby has all of those, and more, and it uses some of the original naming, but thankfully, it also provides aliases.
Map
In Ruby, map is originally named Enumerable#collect but is also available as Enumerable#map, which is the name preferred by most Rubyists.
As mentioned above, this is also available in Python as the map built-in function.
FlatMap
In Ruby, flatMap is originally named Enumerable#collect_concat but is also available as Enumerable#flat_map, which is the name preferred by most Rubyists.
Filter
In Ruby, filter is originally named Enumerable#select, which is the name preferred by most Rubyists, but is also available as Enumerable#find_all.
FilterNot
In Ruby, filterNot is named Enumerable#reject.
FindFirst
In Ruby, findFirst is originally named Enumerable#detect, but is also available as Enumerable#find.
Fold
In Ruby, fold is originally named Enumerable#inject, but is also available as Enumerable#reduce.
It also exists in Python as functools.reduce.
Unfold
In Ruby, unfold is named Enumerator::produce.
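A brief sketch of unfold in practice, assuming Ruby 2.7 or later (where Enumerator.produce is available). The sequences are examples chosen here, not taken from the original answer.

```ruby
# Start from a seed and repeatedly apply the block to get the next element.
powers_of_two = Enumerator.produce(1) { |n| n * 2 }.take(5)

# The "state" can be a pair; map(&:first) extracts the visible part.
fibonacci = Enumerator.produce([0, 1]) { |a, b| [b, a + b] }
                      .take(6)
                      .map(&:first)
```

The produced enumerator is infinite, so take (or lazy) is what bounds it.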
Scan
Scan is unfortunately not available in Ruby. It is available in Python as itertools.accumulate.
A deep dive into recursion patterns
Armed with our nomenclature from above, we now know that what you wrote is called a fold:
squares = original.inject({}) do |squared, (name, value)|
  squared[name] = value ** 2
  squared
end
What you wrote here works. And that sentence I just wrote is actually surprisingly deep! Because fold has a very powerful property: everything which can be expressed as iterating over a collection can be expressed as a fold. In other words, everything that can be expressed as recursing over a collection (in a functional language), everything that can be expressed as looping / iterating over a collection (in an imperative language), everything that can be expressed using any of the aforementioned functions (map, filter, find), everything that can be expressed using Python's comprehensions, everything that can be expressed using some of the additional functions we haven't discussed yet (e.g. groupBy) can be expressed using fold.
If you have fold, you don't need anything else! If you were to remove every method from Enumerable except Enumerable#inject, you could still write everything you could write before; you could actually re-implement all the methods you just removed only by using Enumerable#inject. In fact, I did that once for fun as an exercise. You could also implement the missing scan operation mentioned above.
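To illustrate the claim, here is a minimal sketch of map, select, and scan written only in terms of Enumerable#inject. The helper names (map_via_inject and so on) are hypothetical, and this is not the author's own exercise, just one way it could look.

```ruby
# map: append the transformed element to the accumulator.
def map_via_inject(xs)
  xs.inject([]) { |acc, x| acc + [yield(x)] }
end

# select: append the element only if the block returns a truthy value.
def select_via_inject(xs)
  xs.inject([]) { |acc, x| yield(x) ? acc + [x] : acc }
end

# scan: like fold, but keeps every intermediate accumulator value.
# The accumulator is a pair of [results so far, running value].
def scan_via_inject(xs, initial)
  xs.inject([[], initial]) do |(results, running), x|
    running = yield(running, x)
    [results + [running], running]
  end.first
end

doubled      = map_via_inject([1, 2, 3]) { |x| x * 2 }
evens        = select_via_inject([1, 2, 3, 4]) { |x| x.even? }
running_sums = scan_via_inject([1, 2, 3, 4], 0) { |sum, x| sum + x }
```

The running_sums result matches what Python's itertools.accumulate([1, 2, 3, 4]) produces.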
It is not necessarily obvious that fold really is general, but think of it this way: a collection can be either empty or not. fold has two arguments, one which tells it what to do when the collection is empty, and one which tells it what to do when the collection is not empty. Those are the only two cases, so every possible case is handled. Therefore, fold can do everything!
Or a different viewpoint: a collection is a stream of instructions, either the EMPTY instruction or the ELEMENT(value) instruction. fold is a skeleton interpreter for that instruction set, and you as a programmer can supply the implementation for the interpretation of both those instructions, namely the two arguments to fold are the interpretation of those instructions. [This eye-opening interpretation of fold as an interpreter and a collection as an instruction stream is due to Rúnar Bjarnason, co-author of Functional Programming in Scala and co-designer of the Unison Programming Language. Unfortunately, I cannot find the original talk anymore, but The Interpreter Pattern Revisited presents a much more general idea that should also bring it across.]
Note that the way you are using fold here is somewhat awkward, because you are using mutation (i.e. a side-effect) for an operation that is deeply rooted in functional programming. Fold uses the return value of one iteration as the starting value for the next iteration. But the operation you are doing is a mutation which doesn't actually return a useful value for the next iteration. That's why you have to then return the accumulator which you just modified.
If you were to express this in a functional way using Hash#merge, without mutation, it would look cleaner:
squares = original.inject({}) do |squared, (name, value)|
  squared.merge({ name => value ** 2 })
end
However, for the specific use-case where instead of returning a new accumulator on each iteration and using that for the next iteration, you want to just mutate the same accumulator over and over again, Ruby offers a different variant of fold under the name Enumerable#each_with_object, which completely ignores the return value of the block and just passes the same accumulator object every time. Confusingly, the order of the arguments in the block is reversed between Enumerable#inject (accumulator first, element second) and Enumerable#each_with_object (element first, accumulator second):
squares = original.each_with_object({}) do |(name, value), squared|
  squared[name] = value ** 2
end
However, it turns out, we can make this even simpler. I explained above that fold is general, i.e. it can solve every problem. Then why do we have those other operations in the first place? We have them for the same reason that we have subroutines, conditionals, exceptions, and loops, even though we could do everything with just GOTO: expressivity.
If you read some code using only GOTO, you have to "reverse engineer" what every particular usage of GOTO means: is it checking a condition, is it doing something multiple times? By having different, more specialized constructs, you can recognize at a glance what a particular piece of code does.
The same applies to these collection operations. In your case, for example, you are transforming each element of the original collection into a new element of the result collection. But, you have to actually read and understand what the block does, in order to recognize this.
However, as we discussed above, there is a more specialized operation available which does this: map. And everybody who sees map immediately understands "oh, this is mapping each element 1:1 to a new element", without having to even look at what the block does. So, we can write your code like this instead:
squares = original.map do |name, value|
[name, value ** 2]
end.to_h
Note: Ruby's collection operations are for the most part not type-preserving, i.e. transforming a collection will typically not yield the same type of collection. Instead, in general, collection operations mostly return Arrays, which is why we have to call Array#to_h here at the end.
As you can see, because this operation is more specialized than fold (which can do everything), it is both simpler to read and also simpler to write (i.e. the inside of the block, the part that you as the programmer have to write, is simpler than what you had above).
But we are actually not done! It turns out that for this particular case, where we only want to transform the values of a Hash, there is actually an even more specialized operation available: Hash#transform_values:
squares = original.transform_values do |value|
  value ** 2
end
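The variants discussed in this answer can be checked side by side. The sketch below assumes a small sample hash for original (the question's actual data is not shown here) and confirms that all four approaches build the same result.

```ruby
# Sample input (assumed for illustration).
original = { apple: 2, pear: 3 }

# fold without mutation: each step returns a new hash
via_inject = original.inject({}) do |squared, (name, value)|
  squared.merge(name => value ** 2)
end

# fold variant that mutates one accumulator; note the reversed block arguments
via_each_with_object = original.each_with_object({}) do |(name, value), squared|
  squared[name] = value ** 2
end

# map to [key, value] pairs, then convert the Array back into a Hash
via_map = original.map { |name, value| [name, value ** 2] }.to_h

# most specialized: only the values change
via_transform_values = original.transform_values { |value| value ** 2 }
```

All four produce { apple: 4, pear: 9 }, and original is left untouched in each case.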
Epilogue
One of the things programmers do most often is iterate over collections. Practically every program ever written in any programming language does this in some form or another. Therefore, it is very valuable to study the operations your particular programming language offers for doing this.
In Ruby, this means studying the Enumerable mixin as well as the additional methods provided by Array and Hash.
Also, study Enumerators and how to combine them.
But it is also very helpful to study the history of where these operations come from, which is mostly functional programming. If you understand the history of those operations, you will be able to quickly familiarize yourself with collection operations in many languages, since they all borrow from that same history, e.g. ECMAScript, Python, .NET LINQ, Java Streams, C++ STL algorithms, Swift, and many more.
What does Ruby have that Python doesn't, and vice versa?
You can have code in the class definition in both Ruby and Python. However, in Ruby you have a reference to the class (self). In Python you don't have a reference to the class, as the class isn't defined yet.
An example:
class Kaka
  puts self
end
self in this case is the class, and this code would print out "Kaka". In Python, the class object itself cannot be accessed from inside the class definition body, because it does not exist until the body has finished executing.
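A slightly larger sketch of what having self available in the class body allows. The constant and method names here are made up for the example.

```ruby
class Kaka
  # Inside the class body, self is the class object being defined.
  SELF_IN_BODY = self

  # An implicit-receiver call goes to self, so this is Kaka.name.
  NAME_IN_BODY = name

  # Class methods defined earlier in the body can be called later in the body.
  def self.describe
    "I am #{self}"
  end

  DESCRIPTION = describe
end
```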