Ruby enumerator chaining
You might find it useful to break these expressions down and use IRB or Pry to see what Ruby is doing. Let's start with:
[1,2,3].each_with_index.map { |i,j| i*j }
Let
enum1 = [1,2,3].each_with_index
#=> #<Enumerator: [1, 2, 3]:each_with_index>
We can use Enumerable#to_a (or Enumerable#entries) to convert enum1 to an array to see what it will be passing to the next enumerator (or to a block if it had one):
enum1.to_a
#=> [[1, 0], [2, 1], [3, 2]]
No surprise there. But enum1 does not have a block. Instead we are sending it the method Enumerable#map:
enum2 = enum1.map
#=> #<Enumerator: #<Enumerator: [1, 2, 3]:each_with_index>:map>
You might think of this as a sort of "compound" enumerator. In the original expression this is the enumerator that receives the block, so converting it to an array confirms that it passes the same elements into the block as enum1 would have:
enum2.to_a
#=> [[1, 0], [2, 1], [3, 2]]
We see that the array [1,0] is the first element enum2 passes into the block. Array decomposition (destructuring) is applied to this array to assign values to the block variables:
i => 1
j => 0
That is, Ruby is setting:
i,j = [1,0]
We can now invoke enum2 by sending it the method each with the block:
enum2.each { |i,j| i*j }
#=> [0, 2, 6]
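To watch that assignment happen one element at a time, Enumerator#next can pull out the value an enumerator would hand to a block. This is only a small sketch; the names first, i, j and products are chosen here for illustration.

```ruby
enum = [1, 2, 3].each_with_index

# next returns the element the enumerator would pass to a block
first = enum.next

# Ruby spreads the two-element array across the variables,
# just as it does for the block parameters |i, j|
i, j = first

# each runs its own internal iteration from the start,
# independent of how far next has advanced
products = []
enum.each { |a, b| products << a * b }
```

Here first is [1, 0], so i is 1 and j is 0, and products ends up as [0, 2, 6].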
Next consider:
[1,2,3].map.each_with_index { |i,j| i*j }
We have:
enum3 = [1,2,3].map
#=> #<Enumerator: [1, 2, 3]:map>
enum3.to_a
#=> [1, 2, 3]
enum4 = enum3.each_with_index
#=> #<Enumerator: #<Enumerator: [1, 2, 3]:map>:each_with_index>
enum4.to_a
#=> [[1, 0], [2, 1], [3, 2]]
enum4.each { |i,j| i*j }
#=> [0, 2, 6]
Since enum2 and enum4 pass the same elements into the block, we see these are just two ways of doing the same thing.
Here's a third equivalent chain:
[1,2,3].map.with_index { |i,j| i*j }
We have:
enum3 = [1,2,3].map
#=> #<Enumerator: [1, 2, 3]:map>
enum3.to_a
#=> [1, 2, 3]
enum5 = enum3.with_index
#=> #<Enumerator: #<Enumerator: [1, 2, 3]:map>:with_index>
enum5.to_a
#=> [[1, 0], [2, 1], [3, 2]]
enum5.each { |i,j| i*j }
#=> [0, 2, 6]
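One practical difference between the last two chains is that with_index accepts a starting offset, whereas each_with_index always counts from zero. A brief sketch (the variable names are illustrative):

```ruby
# Default offset is 0, matching each_with_index
zero_based = [1, 2, 3].map.with_index    { |i, j| i * j }

# Passing 1 makes the index run 1, 2, 3
one_based  = [1, 2, 3].map.with_index(1) { |i, j| i * j }
```

zero_based is [0, 2, 6] as before, while one_based is [1, 4, 9].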
To take this one step further, suppose we had:
[1,2,3].select.with_index.with_object({}) { |(i,j),h| ... }
We have:
enum6 = [1,2,3].select
#=> #<Enumerator: [1, 2, 3]:select>
enum6.to_a
#=> [1, 2, 3]
enum7 = enum6.with_index
#=> #<Enumerator: #<Enumerator: [1, 2, 3]:select>:with_index>
enum7.to_a
#=> [[1, 0], [2, 1], [3, 2]]
enum8 = enum7.with_object({})
#=> #<Enumerator: #<Enumerator: #<Enumerator: [1, 2, 3]:
# select>:with_index>:with_object({})>
enum8.to_a
#=> [[[1, 0], {}], [[2, 1], {}], [[3, 2], {}]]
The first element enum8 passes into the block is the array [[1, 0], {}], which is assigned to the block variables as:
(i,j),h = [[1, 0], {}]
Decomposition is then applied to assign values to the block variables:
i => 1
j => 0
h => {}
Note that enum8.to_a shows an empty hash in each of its three elements, but that's only because to_a runs no block that could modify the hash. With a real block, h would hold whatever earlier iterations had stored in it.
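To make the elided block concrete, here is a sketch with a hypothetical body that records each element's index in the hash. The body is an assumption for illustration, not the block from the original question.

```ruby
# Hypothetical block body: store index j under key i, and return true
# so that select would keep every element
result = [1, 2, 3].select.with_index.with_object({}) do |(i, j), h|
  h[i] = j
  true
end

# Every tuple produced by to_a carries the very same hash object,
# which is why mutations made in one iteration are visible in the next
enum8 = [1, 2, 3].select.with_index.with_object({})
same_hash = enum8.to_a.map { |_, h| h.object_id }.uniq.size == 1
```

Because with_object returns its memo object, result is the hash (1 mapped to 0, 2 to 1, 3 to 2) rather than the array select would have built.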
How does Ruby Enumerators chaining work exactly?
Todd's answer is excellent, but I feel like seeing some more Ruby code might be beneficial. Specifically, let's try to write each and map on Array ourselves.
I won't use any Enumerable or Enumerator methods directly, so we can see how it all works under the hood. (I'll still use for loops, and those technically call #each under the hood, but that's only cheating a little.)
First, there's each. each is easy. It iterates over the array and applies a function to each element, before returning the original array.
def my_each(arr, &block)
  for i in 0..arr.length-1
    block[arr[i]]
  end
  arr
end
Simple enough. Now what if we don't pass a block? Let's change it up a bit to support that. We effectively want to delay the act of doing the each to allow the Enumerator to do its thing.
def my_each(arr, &block)
  if block
    for i in 0..arr.length-1
      block[arr[i]]
    end
    arr
  else
    Enumerator.new do |y|
      my_each(arr) { |*x| y.yield(*x) }
    end
  end
end
So if we don't pass a block, we make an Enumerator that, when consumed, calls my_each, using the enumerator yielder object as a block. The y object is a funny thing, but you can just think of it as basically being the block you'll eventually pass in. So, in
my_each([1, 2, 3]).with_index { |x, i| x * i }
think of y as being like the { |x, i| x * i } bit. It's a bit more complicated than that, but that's the idea.
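To see the delayed behaviour in action, the sketch below repeats the block-or-enumerator version of my_each and pokes at the Enumerator it returns. The names enum, first_value, all_values, pairs and returned are illustrative only.

```ruby
def my_each(arr, &block)
  if block
    for i in 0..arr.length - 1
      block[arr[i]]
    end
    arr
  else
    # No block yet: wrap the iteration so it runs only when consumed
    Enumerator.new do |y|
      my_each(arr) { |*x| y.yield(*x) }
    end
  end
end

enum = my_each([10, 20, 30])   # nothing has been iterated at this point

first_value = enum.next        # pulls a single element through the yielder
all_values  = enum.to_a        # consumes the whole iteration

pairs = []
returned = enum.with_index { |x, i| pairs << [x, i] }
```

first_value is 10 and all_values is [10, 20, 30]. After the last line, pairs holds [[10, 0], [20, 1], [30, 2]] and returned is the original array, since that is what my_each itself returns.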
Incidentally, in Ruby 2.7 and later, the Enumerator::Yielder object has its own #to_proc, so if you're on a recent Ruby version, you can just do
Enumerator.new do |y|
  my_each(arr, &y)
end
rather than
Enumerator.new do |y|
  my_each(arr) { |*x| y.yield(*x) }
end
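As a quick check of the &y shorthand, here it is applied to the built-in Array#each instead of my_each. This sketch assumes Ruby 2.7 or later.

```ruby
# Requires Ruby 2.7+, where Enumerator::Yielder#to_proc exists
enum = Enumerator.new do |y|
  [1, 2, 3].each(&y)   # &y converts the yielder into the block each receives
end

values  = enum.to_a
doubled = enum.map { |x| x * 2 }
```

values comes out as [1, 2, 3] and doubled as [2, 4, 6].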
Now let's extend this approach to map. Writing map with a block is easy. It's just like each, but we accumulate the results.
def my_map(arr, &block)
  result = []
  for i in 0..arr.length-1
    result << block[arr[i]]
  end
  result
end
Simple enough. Now what if we don't pass a block? Let's do the exact same thing we did for my_each. That is, we're just going to make an Enumerator and, inside that Enumerator, we call my_map.
def my_map(arr, &block)
  if block
    result = []
    for i in 0..arr.length-1
      result << block[arr[i]]
    end
    result
  else
    Enumerator.new do |y|
      my_map(arr) { |*x| y.yield(*x) }
    end
  end
end
Now, the Enumerator knows that, whenever it eventually gets a block, it's going to use my_map on that block at the end. We can see that these two functions actually behave, on arrays, like map and each do:
my_each([1, 2, 3]).with_index { |x, i| x * i } # [1, 2, 3]
my_map([1, 2, 3]).with_index { |x, i| x * i }  # [0, 2, 6]
So your intuition was spot on. You asked:
"map seems to carry the information that a function has to be applied, on top of carrying the data to iterate over. How does that work?"
That's exactly what it does. map creates an Enumerator whose block knows to call map at the end, whereas each does the same but with each. Of course, in reality, all of this is implemented in C for efficiency and bootstrapping reasons, but the fundamental idea is still there.
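The built-in methods show the same split, which can be checked directly. In this short sketch the variable names are illustrative.

```ruby
# The enumerator remembers which method it was created from
label = [1, 2, 3].map.inspect

# Same block, same elements; only the originating method differs
from_each = [1, 2, 3].each.with_index { |x, i| x * i }
from_map  = [1, 2, 3].map.with_index  { |x, i| x * i }
```

label is "#<Enumerator: [1, 2, 3]:map>". from_each is the untouched array [1, 2, 3], while from_map collects the block results into [0, 2, 6].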
chaining ruby enumerator functions in a clean way
As a number of people in the comments suggest, this is really a matter of style; that being said, I have to agree with the comments within the code and say that you want to avoid method chaining at the end of a do..end.
If you're going to split methods by line, use a do..end. {} and do..end are synonymous, as you know, but the braces are more often used (in my experience) for single-line pieces of code, and as 'mu is too short' pointed out, if you're set on using them, you may want to look into lambdas. But I'd stick to do..end in this case.
A general style rule I was taught, and that I follow, is to split up chains if what is being worked with changes class in a way that might not be intuitive. For example, fizz = "buzz".split.reverse breaks up a string into an array, but it's clear what the code is doing.
In the example you provided, there's a lot going on that's a bit hard to follow. I like that you wrote out the group_by using hash notation in the last example, because it's clear what the group_by is grouping by there and what the output is. I'd put it in a [well named] variable of its own.
grouped_by_month = movies.group_by { |m| m.release_date.strftime("%B") }
count_by_month = grouped_by_month.map { |month, list| [month, list.size] }.sort_by(&:last).reverse
This splits up the code into one line that sets up the grouping hash and another line that manipulates it.
Again, this is style, so everyone has their own quirks; this is simply how I'd edit this based on a quick glance. You seem to be getting into Ruby quite well overall! Sometimes I just like the look of a chain of methods on one line, even if it's against best practices (and I'm doing Project Euler or some other project of my own). I'd suggest looking at large projects on GitHub (e.g. Rails) to get a feel for how those far more experienced than myself write clean code. Good luck!
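For anyone who wants to run the two-step version, here is a self-contained sketch. The Movie struct and the sample titles and dates are made up for illustration; only the release_date attribute comes from the code above.

```ruby
require "date"

# Hypothetical sample data; Movie and its fields are assumptions
Movie = Struct.new(:title, :release_date)

movies = [
  Movie.new("A", Date.new(2010, 7, 16)),
  Movie.new("B", Date.new(2008, 7, 18)),
  Movie.new("C", Date.new(2012, 7, 20)),
  Movie.new("D", Date.new(2012, 5, 4)),
  Movie.new("E", Date.new(2015, 5, 15)),
  Movie.new("F", Date.new(1999, 3, 31)),
]

# Step 1: build the grouping hash (month name => movies)
grouped_by_month = movies.group_by { |m| m.release_date.strftime("%B") }

# Step 2: turn it into [month, count] pairs, largest count first
count_by_month = grouped_by_month
  .map { |month, list| [month, list.size] }
  .sort_by(&:last)
  .reverse
```

With this data count_by_month is [["July", 3], ["May", 2], ["March", 1]]. Naming the intermediate hash makes the change from an array to a hash, and back to an array, explicit.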
Ruby: Enumerator Chain
Perhaps you need a map which you would like to apply only to the leaves:
module Enumerable
  def nested_map(&block)
    map { |e|
      case e
      when Enumerable
        e.nested_map(&block)
      else
        block.call(e)
      end
    }
  end
end
p [[1,2], [3,4]].nested_map(&:succ)
#=> [[2, 3], [4, 5]]
or a map which would apply only at the n-th level of the nested structure:
module Enumerable
  def deep_map(level, &block)
    if level == 0
      map(&block)
    else
      map { |e| e.deep_map(level - 1, &block) }
    end
  end
end
p [[1,2], [3,4]].deep_map(1, &:succ)
#=> [[2, 3], [4, 5]]
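The two helpers give the same answer on a regular two-level array, so their difference is easier to see on other inputs. The sketch below restates both methods in compact form and uses made-up sample data (ragged, grid).

```ruby
module Enumerable
  # Recurse into anything enumerable; apply the block only to leaves
  def nested_map(&block)
    map { |e| e.is_a?(Enumerable) ? e.nested_map(&block) : block.call(e) }
  end

  # Apply the block to whatever sits exactly `level` levels down
  def deep_map(level, &block)
    if level == 0
      map(&block)
    else
      map { |e| e.deep_map(level - 1, &block) }
    end
  end
end

ragged = [[1, [2, 3]], [4]]
leaves_bumped = ragged.nested_map(&:succ)        # reaches leaves at any depth

grid = [[1, 2], [3, 4]]
row_sums     = grid.deep_map(0) { |row| row.sum } # level 0 sees the rows
cells_bumped = grid.deep_map(1, &:succ)           # level 1 sees the numbers
```

leaves_bumped is [[2, [3, 4]], [5]], row_sums is [3, 7], and cells_bumped is [[2, 3], [4, 5]].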
Ruby enumerator with chaining
It's because the select method is used for selection: it returns the elements for which the block you give it evaluates to a truthy value. In your case, you didn't put any condition in the block, so it returned nil.
You want to do something like this:
n = [1,2,3]
n.select { |num| num < 3 }
# This should return:
#=> [1, 2]
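A small sketch of how the block's return value drives select (the variable names are illustrative):

```ruby
n = [1, 2, 3]

# The block is truthy for 1 and 2, so those are kept
kept = n.select { |num| num < 3 }

# A block that returns nil for every element keeps nothing
nothing = n.select { |_num| nil }
```

kept is [1, 2] and nothing is an empty array.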
Ruby method chaining with an Enumerable class
Your Coffee class defines method accessors for name and strength. For a single coffee object, you can thus get the attributes with
coffee.name
# => "Laos"
coffee.strength
# => 10
In your Criteria#each method, you try to access the attributes using the subscript operator, i.e. c[:strength] (with c being an instance of Coffee in this case). However, your Coffee class does not implement the subscript accessor, which results in the NoMethodError you see there.
You could thus either adapt your Criteria#each method as follows:
def each(&block)
  @klass.collection.select { |c| c.strength == criteria[:strength] }.each(&block)
end
or you could implement the subscript operators on your Coffee class:
class Coffee
  attr_accessor :name
  attr_accessor :strength
  # ...
  def [](key)
    public_send(key)
  end
  def []=(key, value)
    public_send(:"#{key}=", value)
  end
end
Now, as an addendum, you might want to extend your each method in any case. A common (and often implicitly expected) pattern is that methods like each return an Enumerator if no block was given. This allows patterns like CoffeeShop.strength(10).each.group_by(&:strength).
You can implement this with a simple one-liner in your method:
def each(&block)
  return enum_for(__method__) unless block_given?
  @klass.collection.select { |c| c.strength == criteria[:strength] }.each(&block)
end
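Putting the pieces together, here is a minimal runnable sketch. The question's real Coffee, Criteria and CoffeeShop classes are not shown here, so the constructors, the @collection variable and the sample coffees below are assumptions made for illustration.

```ruby
# Hypothetical stand-in for the question's Coffee class
class Coffee
  attr_accessor :name, :strength

  def initialize(name, strength)
    @name = name
    @strength = strength
  end
end

# Hypothetical stand-in for Criteria: filters a collection by strength
class Criteria
  include Enumerable

  attr_reader :criteria

  def initialize(collection, criteria)
    @collection = collection
    @criteria = criteria
  end

  def each(&block)
    # Without a block, hand back an Enumerator bound to this same method
    return enum_for(__method__) unless block_given?
    @collection.select { |c| c.strength == criteria[:strength] }.each(&block)
  end
end

coffees = [Coffee.new("Laos", 10), Coffee.new("Angkor", 7), Coffee.new("Nordic", 10)]
strong  = Criteria.new(coffees, strength: 10)

names   = strong.map(&:name)                 # Enumerable methods build on each
enum    = strong.each                        # no block: an Enumerator comes back
grouped = strong.each.group_by(&:strength)   # so chaining off each works
```

names is ["Laos", "Nordic"], enum is an Enumerator, and grouped has the single key 10 holding both matching coffees.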
Enumerator chain starting with find
Ok, I finally figured it out.
The reason find does not terminate the iteration after the first value is processed by the block is that the collect_i iterator within the collect (aka map) method of the Enumerable module explicitly returns nil after every iteration, no matter what the block passed to map or collect returned. Here it is, taken from enum.c:
static VALUE
collect_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, ary))
{
    rb_ary_push(ary, enum_yield(argc, argv));
    return Qnil;
}
So the internal call to find on the initial array always gets nil as the result of yielding a value, and thus doesn't stop iterating until the last element is processed. This is easy to prove by downloading the Ruby source as an archive and modifying this function like this:
static VALUE
collect_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, ary))
{
    rb_ary_push(ary, enum_yield(argc, argv));
    return ary;
}
After saving and building ruby from modified source, we get this:
irb(main):001:0> [1,2,3,4,5].find.map { |x| x*x }
=> [1]
Another interesting thing is that Rubinius implements collect exactly like this, so I think there's a chance that MRI and Rubinius produce different results for this statement. I don't have a working Rubinius installation right now, but I will check this when I have an opportunity and update this post with the result.
Not sure if this will ever be of any use to anyone, except maybe for satisfying one's curiosity, but still :)
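For comparison, the behaviour on an unmodified MRI can be checked from plain Ruby. This sketch assumes a stock MRI build; the contrast with with_index is added here for illustration, since that method passes the block's own result back to find.

```ruby
# map's internal block returns nil to find, so find never sees a truthy
# value and runs to the end; map still collects every block result
squares = [1, 2, 3, 4, 5].find.map { |x| x * x }

# with_index hands the block's result back to find,
# so find stops at the first truthy value
third = [1, 2, 3, 4, 5].find.with_index { |_x, i| i == 2 }
```

On stock MRI, squares is [1, 4, 9, 16, 25] and third is 3.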
Chaining enumerators that yield multiple arguments
From the discussion so far, it follows that we can analyze the source code, but we do not know the whys. The Ruby core team is quite responsive. I recommend that you sign in at http://bugs.ruby-lang.org/issues/ and post a bug report there. They will surely look at this issue within a few weeks at most, and you can probably expect it to be corrected in the next minor version of Ruby. (That is, unless there is a design rationale unknown to us to keep things as they are.)