Help understanding yield and enumerators in Ruby
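The Enumerator::Yielder#yield method and the Enumerator::Yielder#<< method do the same job: both hand a value to the consumer. They are not strict aliases, though. << takes exactly one argument and returns the yielder, so calls can be chained, whereas yield accepts any number of arguments and does not return the yielder.
So, when you pass a single value and ignore the return value, which of the two you use is 100% personal preference, just like Enumerable#collect and Enumerable#map, or Enumerable#inject and Enumerable#reduce.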
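For illustration, here is a minimal sketch of both methods side by side. It relies on the documented behaviour that << returns the yielder (so it chains) and that yield can take several arguments; the variable names chained and pairs are made up for this example.

```ruby
# << returns the yielder, so pushes can be chained on one line.
chained = Enumerator.new do |y|
  y << 1 << 2 << 3
end

# yield accepts several arguments at once; the consumer sees them
# as one multi-valued element.
pairs = Enumerator.new do |y|
  y.yield :a, 1
  y.yield :b, 2
end

p chained.to_a  # prints [1, 2, 3]
p pairs.to_a    # prints [[:a, 1], [:b, 2]]
```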
ruby methods that either yield or return Enumerator
The core libraries insert a guard, return to_enum(:name_of_this_method, arg1, arg2, ..., argn) unless block_given?, at the top of the method. In your case:
class Array
  def double
    return to_enum(:double) unless block_given?
    each { |x| yield 2*x }
  end
end
>> [1, 2, 3].double { |x| puts(x) }
2
4
6
>> ys = [1, 2, 3].double.select { |x| x > 3 }
#=> [4, 6]
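When the method takes arguments, they have to be forwarded to to_enum so that the enumerator can replay the same call later. The sketch below shows this on a small stand-alone class rather than on Array; the class name Countdown and the step parameter are hypothetical and only serve the illustration.

```ruby
# Hypothetical example class; not part of the original answer.
class Countdown
  include Enumerable

  def initialize(start)
    @start = start
  end

  # Yields start, start - step, ... while the value stays positive.
  def each(step = 1)
    # Forward the argument so the enumerator repeats this exact call.
    return to_enum(:each, step) unless block_given?

    n = @start
    while n > 0
      yield n
      n -= step
    end
    self
  end
end

p Countdown.new(5).each(2).to_a        # prints [5, 3, 1]
p Countdown.new(3).map { |n| n * 10 }  # prints [30, 20, 10]
```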
Enumerator new and yield
Consider the following code:
fib = Enumerator.new do |y|
  puts "Enter enumerator"
  a = b = 1
  loop do
    puts "Inside loop"
    y << a
    puts "y: #{y.inspect}, a: #{a}, b: #{b}"
    a, b = b, a + b
  end
end
puts fib.take(5)
It prints:
# Enter enumerator
# Inside loop
# y: #<Enumerator::Yielder:0x000000059a27e8>, a: 1, b: 1
# Inside loop
# y: #<Enumerator::Yielder:0x000000059a27e8>, a: 1, b: 2
# Inside loop
# y: #<Enumerator::Yielder:0x000000059a27e8>, a: 2, b: 3
# Inside loop
# y: #<Enumerator::Yielder:0x000000059a27e8>, a: 3, b: 5
# Inside loop
# 1
# 1
# 2
# 3
# 5
This output answers all of the questions you asked. Note that we entered the enumerator block only once. Let's dig in:
Why is loop infinite?
Because the Fibonacci sequence is infinite. This enumerator is intended to be used with Enumerable#take (see the example above).
What is Enumerator::Yielder?
It is an abstraction. Its yield method calls back the caller's block, passing its arguments as block parameters.
What does the << method do?
It yields once. In other words, it calls back the caller's code, passing its argument to the caller's block. In this particular example, it calls back the block that take supplies internally, passing a from the enumerator as the block parameter.
Why does an infinite loop happen when y << a is removed?
Because nothing is ever yielded, so take never receives the values it is waiting for and nothing stops the loop. In my example, the block stops after yielding five times (5 being the argument to take).
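To make the on-demand behaviour visible without the puts noise, here is a small sketch that counts how often the loop body runs. The counter calls and the variables before, taken and after are additions for this illustration; the lazy chain at the end is just one more way to consume an infinite enumerator safely.

```ruby
calls = 0
fib = Enumerator.new do |y|
  a = b = 1
  loop do
    calls += 1          # count how often the loop body starts
    y << a
    a, b = b, a + b
  end
end

before = calls          # creating the enumerator runs nothing
taken  = fib.take(5)    # take stops the block after the fifth value
after  = calls

p before  # prints 0
p taken   # prints [1, 1, 2, 3, 5]
p after   # prints 5

# A lazy chain also pulls only as many values as it needs.
p fib.lazy.select(&:even?).first(3)  # prints [2, 8, 34]
```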
Enumerator yielder.yield VS Proc.yield
You are confusing Ruby's yield statement with Enumerator::Yielder's yield method and Proc's yield method. They may be spelled the same but they are completely different.
Statement
The yield statement has no receiver. Inside a method it means "Run the block right now". An error occurs if no block is attached. It is not always given an argument, because sometimes you just want to run the block.
def foo
  yield :bar
end
foo # LocalJumpError
foo { |x| puts x } # bar
Enumerator::Yielder
For a yielder, yield is almost always given an argument. That's because it means the same as <<, which is "The next time someone calls next on me, give them this value".
Enumerator.new { |yielder| yielder.yield 3 }.next # 3
Enumerator.new { |yielder| yielder << 3 }.next # same thing
I think it's a good idea to use << to avoid confusion with the yield statement.
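As a slightly longer sketch of the "give them this value on the next call to next" idea, the following enumerator hands out two values and then signals that it is exhausted. The symbols and the variable name e are arbitrary choices for the example.

```ruby
e = Enumerator.new do |yielder|
  yielder << :first
  yielder << :second
end

p e.next  # prints :first
p e.next  # prints :second

begin
  e.next
rescue StopIteration
  # The block has finished, so there is nothing left to hand out.
  puts "done"
end
```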
Proc
Procs and lambdas are basically functions. yield here means the same thing as call, which is "Just call the function". You can give it an argument or not, depending on how the proc was defined. Nothing fancy here.
proc { |x| puts x }.yield(:bar) # bar
proc { |x| puts x }.call(:bar) # same thing as previous line
I think it's a good idea to use call to avoid confusion with the yield statement.
How does Ruby Enumerators chaining work exactly?
Todd's answer is excellent, but I feel like seeing some more Ruby code might be beneficial. Specifically, let's try to write each and map on Array ourselves.
I won't use any Enumerable or Enumerator methods directly, so we see how it's all working under the hood (I'll still use for loops, and those technically call #each under the hood, but that's only cheating a little).
First, there's each. each is easy. It iterates over the array and applies a function to each element, before returning the original array.
def my_each(arr, &block)
  for i in 0..arr.length-1
    block[arr[i]]
  end
  arr
end
Simple enough. Now what if we don't pass a block? Let's change it up a bit to support that. We effectively want to delay the act of doing the each to allow the Enumerator to do its thing.
def my_each(arr, &block)
  if block
    for i in 0..arr.length-1
      block[arr[i]]
    end
    arr
  else
    Enumerator.new do |y|
      my_each(arr) { |*x| y.yield(*x) }
    end
  end
end
So if we don't pass a block, we make an Enumerator that, when consumed, calls my_each, using the enumerator's yielder object as a block. The y object is a funny thing but you can just think of it as basically being the block you'll eventually pass in. So, in
my_each([1, 2, 3]).with_index { |x, i| x * i }
think of y as being like the { |x, i| x * i } bit. It's a bit more complicated than that, but that's the idea.
Incidentally, on Ruby 2.7 and later, the Enumerator::Yielder object got its own #to_proc, so if you're on a recent Ruby version, you can just do
Enumerator.new do |y|
  my_each(arr, &y)
end
rather than
Enumerator.new do |y|
  my_each(arr) { |*x| y.yield(*x) }
end
Now let's extend this approach to map. Writing map with a block is easy. It's just like each but we accumulate the results.
def my_map(arr, &block)
  result = []
  for i in 0..arr.length-1
    result << block[arr[i]]
  end
  result
end
Simple enough. Now what if we don't pass a block? Let's do the exact same thing we did for my_each. That is, we're just going to make an Enumerator and, inside that Enumerator, we call my_map.
def my_map(arr, &block)
  if block
    result = []
    for i in 0..arr.length-1
      result << block[arr[i]]
    end
    result
  else
    Enumerator.new do |y|
      my_map(arr) { |*x| y.yield(*x) }
    end
  end
end
Now, the Enumerator knows that, whenever it eventually gets a block, it's going to use my_map on that block at the end. We can see that these two functions actually behave, on arrays, like map and each do:
my_each([1, 2, 3]).with_index { |x, i| x * i } # [1, 2, 3]
my_map([1, 2, 3]).with_index { |x, i| x * i } # [0, 2, 6]
So your intuition was spot on:
"map seems to carry the information that a function has to be applied, on top of carrying the data to iterate over. How does that work?"
That's exactly what it does. map creates an Enumerator whose block knows to call map at the end, whereas each does the same but with each. Of course, in reality, all of this is implemented in C for efficiency and bootstrapping reasons, but the fundamental idea is still there.
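As a quick check, the built-in Array#each and Array#map can be called without a block and chained with with_index in the same way. This short sketch uses only core methods; the variable names are arbitrary.

```ruby
arr = [1, 2, 3]

# each without a block gives an Enumerator that still "remembers" each,
# so the chain returns the original array.
each_result = arr.each.with_index { |x, i| x * i }

# map without a block gives an Enumerator that "remembers" map,
# so the chain collects the block results.
map_result = arr.map.with_index { |x, i| x * i }

p each_result  # prints [1, 2, 3]
p map_result   # prints [0, 2, 6]
```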