What Is the Purpose of the Enumerator Class in Ruby

Why must we call to_a on an enumerator object?

The purpose of enumerators is lazy evaluation. When you call each_slice, you get back an enumerator object. This object does not calculate the entire grouped array up front. Instead, it calculates each “slice” as it is needed. This helps save on memory, and also allows you quite a bit of flexibility in your code.

This Stack Overflow post has a lot of useful information:

What is the purpose of the Enumerator class in Ruby

To give you a cut-and-dried answer to your question “Why must I call to_a when...”, the answer is that it hasn’t done the work yet. It hasn’t looped through the array at all. So far it has only defined an object that says that, when it does go through the array, you want the elements two at a time. You then have the freedom either to force it to process every element in the enumerable (by calling to_a), or to use next or each to step through and stop partway (maybe processing only half of the elements, as opposed to processing all of them and throwing the second half away).
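As a minimal sketch of that difference, the snippet below uses a made-up six-element array (the variable names are illustrative only). It pulls two slices with next and then, separately, forces everything with to_a.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# No block given, so nothing is iterated yet; we only get an Enumerator back.
pairs = numbers.each_slice(2)
pairs.class                              # => Enumerator

# Pull slices one at a time and stop whenever we like.
first_pair  = pairs.next                 # => [1, 2]
second_pair = pairs.next                 # => [3, 4]

# Or force the whole thing into an array in one go.
all_pairs = numbers.each_slice(2).to_a   # => [[1, 2], [3, 4], [5, 6]]
```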

It’s similar to how the Range class does not build up the list of elements in the range. (1..100000) doesn’t make an array of 100000 numbers, but instead defines an object with a min and max and certain operations can be performed on that. For example (1..100000).cover?(5) doesn’t build a massive array to see if that number is in there, but instead just sees if 5 is greater than or equal to 1 and less than or equal to 100000.
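A small sketch of the Range comparison, using illustrative variable names: cover? only looks at the endpoints, while to_a is the point at which an array actually gets built.

```ruby
big_range = (1..100_000)

# cover? only compares against the two endpoints; no array is built.
inside  = big_range.cover?(5)         # => true
outside = big_range.cover?(100_001)   # => false

# The array only exists once we explicitly ask for it.
as_array = big_range.to_a
as_array.length                       # => 100000
```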

The purpose of this all is performance and flexibility.

It may be worth considering whether your implementation actually needs to make an array up front, or whether you can actually keep your RAM consumption down a bit by iterating over the enumerator. (If your real world scenario is as simple as you described, an enumerator won’t help much, but if the array actually is large, an enumerator could help you a lot).

In Ruby, what does it mean to return an 'enumerable'

An Enumerator abstracts the idea of enumeration so that you can use all of the handy Enumerable methods without caring what the underlying data structure is.

For example, you can use an enumerator to make an object that acts kind of like an infinite array:

squares = Enumerator.new do |yielder|
  x = 1
  loop do
    yielder << x ** 2
    x += 1
  end
end

squares.take(10)
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
squares.count
# don't wait up for this one!

The cool thing about enumerators is that they are enumerable themselves and most Enumerable methods return enumerators if you don't give them a block. This is what allows you to chain method calls to get one big enumerator.
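For instance, a rough sketch of such a chain using only built-in methods (the array contents are made up for illustration):

```ruby
letters = %w[a b c]

# each_with_index without a block returns an Enumerator...
indexed = letters.each_with_index
indexed.class   # => Enumerator

# ...which is itself Enumerable, so map can be called on it.
labels = indexed.map { |letter, i| "#{i}:#{letter}" }
# => ["0:a", "1:b", "2:c"]
```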

Here's how I would code each_with_index so that it can be chained nicely:

class Array
  def my_each_with_index(&blk)
    e = Enumerator.new do |yielder|
      i = 0
      each do |x|
        yielder << [x, i]
        i += 1
      end
    end

    return e unless blk
    e.each(&blk)
  end
end

[3,2,1].my_each_with_index { |x, i| puts "#{i}: #{x}" }
# 0: 3
# 1: 2
# 2: 1

So first we create an enumerator which describes how to enumerate with indices. If no block is given, we simply return the enumerator. Otherwise we tell the enumerator to enumerate (which is what each does) using the block.
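A different but commonly used way to get the same "enumerator when no block is given" behaviour is Kernel#to_enum. This is a swapped-in technique, not the approach shown above, and the Playlist class and its method name are hypothetical, invented only for this sketch.

```ruby
# Hypothetical container class, defined only for this illustration.
class Playlist
  def initialize(*songs)
    @songs = songs
  end

  def each_song_with_position
    # Without a block, hand back an Enumerator over this same method.
    return to_enum(:each_song_with_position) unless block_given?

    @songs.each_with_index { |song, i| yield song, i + 1 }
    self
  end
end

playlist = Playlist.new("Intro", "Verse", "Outro")
positions = playlist.each_song_with_position.map { |song, pos| "#{pos}. #{song}" }
# => ["1. Intro", "2. Verse", "3. Outro"]
```

Because the method returns an Enumerator when called without a block, it can be chained with map, select and the rest of Enumerable just like the built-in iterators.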

Ruby Enumerator class

Like many other methods, Array#each returns an Enumerator if no block is passed; if a block is passed, it iterates over the array and calls the block for each item.

When a block is passed, the values returned by the block are discarded and Array#each returns the array itself. (It is Array#map that collects the block's return values into a new array.)

To answer your question, the block and the Enumerator never met.
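A short sketch of the two call styles, with illustrative variable names, shows that the Enumerator only appears when the block is left out:

```ruby
items = [1, 2, 3]

# No block: nothing is iterated, an Enumerator is returned.
enum = items.each
enum.class              # => Enumerator

# Block given: the block runs once per element and no Enumerator is involved.
seen = []
result = items.each { |item| seen << item * 10 }
seen                    # => [10, 20, 30]
result.equal?(items)    # => true, each hands back the receiver itself
```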

When is an enumerator useful?

Is this just some rule of ruby for easier chaining

Yes, this reason exactly. Easy chaining of enumerators. Consider this example:

ary = ['Alice', 'Bob', 'Eve']

records = ary.map.with_index do |item, idx|
  {
    id: idx,
    name: item,
  }
end

records # => [{:id=>0, :name=>"Alice"}, {:id=>1, :name=>"Bob"}, {:id=>2, :name=>"Eve"}]

map yields each element to with_index, which adds the item's index and yields both to your block. The block returns its value to with_index, which passes it back to map, which does its thing (the mapping) and returns the resulting array to the caller.
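As a further sketch of the same chaining, Enumerator#with_index also accepts a starting offset; the names below are illustrative only.

```ruby
names = ['Alice', 'Bob', 'Eve']

# with_index takes an optional starting offset (1 here instead of the default 0).
numbered = names.map.with_index(1) { |name, position| "#{position}. #{name}" }
# => ["1. Alice", "2. Bob", "3. Eve"]
```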

How to use an enumerator

The main distinction between an Enumerator and most other data structures in the Ruby core library (Array, Hash) and standard library (Set, SortedSet) is that an Enumerator can be infinite. You cannot have an Array of all even numbers or a stream of zeroes or all prime numbers, but you can definitely have such an Enumerator:

evens = Enumerator.new do |y|
  i = -2
  y << i += 2 while true
end

evens.take(10)
# => [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

zeroes = [0].cycle

zeroes.take(10)
# => [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

So, what can you do with such an Enumerator? Well, three things, basically.

  1. Enumerator mixes in Enumerable. Therefore, you can use all Enumerable methods such as map, inject, all?, any?, none?, select, reject and so forth. Just be aware that an Enumerator may be infinite whereas map returns an Array, so trying to map an infinite Enumerator may create an infinitely large Array and take an infinite amount of time.

  2. There are wrapping methods which somehow "enrich" an Enumerator and return a new Enumerator. For example, Enumerator#with_index adds a "loop counter" to the block and Enumerator#with_object adds a memo object.

  3. You can use an Enumerator just like you would use it in other languages for external iteration by using the Enumerator#next method which will give you either the next value (and move the Enumerator forward) or raise a StopIteration exception if the Enumerator is finite and you have reached the end.

E.g., an infinite range: (1..1.0/0), where 1.0/0 evaluates to Float::INFINITY.
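The three uses listed above can be sketched roughly as follows. A small finite Enumerator is used so that the end-of-iteration case can be shown; all variable names are illustrative.

```ruby
colors = %w[red green blue].each   # a finite Enumerator

# 1. Enumerable methods are available directly.
shouted = colors.map(&:upcase)     # => ["RED", "GREEN", "BLUE"]

# 2. Wrapping methods: with_index adds a counter, with_object threads a memo.
numbered = colors.with_index(1).map { |color, i| "#{i}:#{color}" }
# => ["1:red", "2:green", "3:blue"]
lengths = colors.with_object({}) { |color, memo| memo[color] = color.length }
lengths["green"]                   # => 5

# 3. External iteration: next advances, StopIteration signals the end.
pulled = [colors.next, colors.next, colors.next]   # => ["red", "green", "blue"]
exhausted = begin
  colors.next
  false
rescue StopIteration
  true
end

# An infinite range can be consumed the same way, a few elements at a time.
first_three = (1..Float::INFINITY).first(3)        # => [1, 2, 3]
```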

Why return an enumerator?

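This is completely in accordance with the spirit of 1.9: to return enumerators whenever possible. String#bytes, String#lines, String#codepoints, but also methods like Array#permutation all return an enumerator. (Since Ruby 2.0, String#bytes, String#lines and String#codepoints return arrays instead; the each_byte, each_line and each_codepoint variants still return enumerators when called without a block.)

In Ruby 1.8, String#to_a resulted in an array of lines, but the method is gone in 1.9.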
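A quick sketch using the each_* string iterators and Array#permutation, which return enumerators when called without a block in current Ruby versions (the sample strings are made up):

```ruby
text = "first\nsecond\nthird"

line_enum  = text.each_line         # Enumerator, nothing has been split yet
first_line = line_enum.next         # => "first\n"

bytes = "hi".each_byte.to_a         # => [104, 105]

perms = [1, 2, 3].permutation(2)    # Enumerator over all ordered pairs
first_perms = perms.first(2)        # => [[1, 2], [1, 3]]
```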

Why does Enumerable not have a length attribute in Ruby?

Enumerable has the count method, which is usually going to be the intuitive "length" of the enumeration.

But why not call it "length"? Well, because it operates very differently. In Ruby's built-in data structures like Array and Hash, length simply retrieves the pre-computed size of the data structure. It should always return instantly.

For Enumerable#count, however, there's no way for it to know what sort of structure it's operating on and thus no quick, clever way to get the size of the enumeration (this is because Enumerable is a module, and can be included in any class). The only way for it to get the size of the enumeration is to actually enumerate through it and count as it goes. For infinite enumerations, count will (appropriately) loop forever and never return.
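To make that concrete, here is a minimal sketch with a hypothetical Countdown class (not part of Ruby) that records how many times its each method has been run. Calling count has to walk the whole enumeration.

```ruby
# Hypothetical Enumerable class that records how often each is run.
class Countdown
  include Enumerable

  attr_reader :passes

  def initialize(from)
    @from = from
    @passes = 0
  end

  def each
    return to_enum(:each) unless block_given?

    @passes += 1
    @from.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(3)
total = countdown.count   # => 3, found by walking every element
countdown.passes          # => 1, count had to call each to get its answer
```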

Difference between enumerable and iterator methods in Ruby

Answer to your original question

[1,2,3].each.is_a?(Enumerable)
#=> true
[1,2,3].each.is_a?(Enumerator)
#=> true
[1,2,3].each.class.ancestors
#=> [Enumerator, Enumerable, Object, Kernel, BasicObject]

Yes, the "iterator" each returns an Enumerator when no block is used.

But if you're just learning Ruby and want to iterate over an Array/Range/Hash, just know that using each will cover most of your cases:

[1, 2, 3].each do |element|
  puts element
end
# 1
# 2
# 3

('a'..'e').each do |element|
  puts element
end
# a
# b
# c
# d
# e

{'a' => 1, 'b' => 2}.each do |key, value|
  puts key
  puts value
end
# a
# 1
# b
# 2

At your level, you shouldn't have to care where those methods are defined, for which class or module or how they're called.

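Finally, for loops are best avoided in Ruby because they can behave surprisingly: unlike each with a block, for does not create a new scope, so the loop variable stays defined after the loop ends.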
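One commonly cited surprise with for is that its loop variable outlives the loop, whereas a block parameter does not. A minimal sketch, with illustrative variable names:

```ruby
for i in 1..3
  # nothing to do; we only care about what happens to i afterwards
end
leaked = i   # => 3, the loop variable is still defined here

(1..3).each { |j| }
block_var_visible = !defined?(j).nil?   # => false, j existed only inside the block
```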

Your updated question

It's good that you made your question clearer. Note that the change might go unnoticed though, especially if you already accepted an answer.

3.times

3.times do |x|
  puts x
end

enumerator = 3.times
enumerator.each do |x|
  puts x
end

Used like this, both are perfectly equivalent. Since the second one is more verbose and enumerator probably isn't used anywhere else, there's no reason to use the second variant. enumerator is longer than 3.times anyway :)

Note that |x| should be on the same line as the start of the block. RuboCop could help you.

each_char

"scriptkiddie".each_char{|x| puts x}
"scriptkiddie".enum_for(:each_char).each{|x| puts x}

Again, no reason to use the 2nd variant if all you do is create an Enumerator and call each directly on it.

Why use Enumerator?

Chaining methods

One reason to use an Enumerator is to be able to chain Enumerable methods:

p 3.times.cycle.first(7)
#=> [0, 1, 2, 0, 1, 2, 0]

or

"script".each_char.with_index { |c, i|
  puts "letter #{i} : #{c}"
}
# letter 0 : s
# letter 1 : c
# letter 2 : r
# letter 3 : i
# letter 4 : p
# letter 5 : t

Infinite lists

Enumerators also make it possible to work with infinite lists.

require 'prime'

every_prime = Prime.each
p every_prime.first(20)
#=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

p every_prime.lazy.select{|x| x > 1000}.first(3)
#=> [1009, 1013, 1019]
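The same lazy pattern works on any infinite source. As a small sketch, here it is on an endless integer range (the variable name is illustrative):

```ruby
# lazy keeps select and map from trying to process the whole infinite range;
# elements are only computed until first(3) has what it needs.
even_squares = (1..Float::INFINITY)
  .lazy
  .select(&:even?)
  .map { |n| n * n }
  .first(3)
# => [4, 16, 36]
```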

Custom iteration

It's possible to define new Enumerators for custom iterations:

require 'date'

def future_working_days(start_date = Date.today)
  Enumerator.new do |yielder|
    # Keep the cursor inside the block so every enumeration restarts from start_date.
    date = start_date
    loop do
      date += 1
      yielder << date unless date.saturday? || date.sunday?
    end
  end
end

puts future_working_days.take(3)
# Output when run on 2017-01-31:
# 2017-02-01
# 2017-02-02
# 2017-02-03
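Another sketch of a custom iteration, this time a Fibonacci sequence (chosen only for illustration). Keeping the state inside the Enumerator block means each enumeration starts from the beginning again.

```ruby
fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1          # state lives inside the block, so each pass starts fresh
  loop do
    yielder << a
    a, b = b, a + b
  end
end

first_run  = fibonacci.take(8)                    # => [0, 1, 1, 2, 3, 5, 8, 13]
second_run = fibonacci.take(8)                    # same result as first_run
under_50   = fibonacci.take_while { |n| n < 50 }  # stops at the first value >= 50
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```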

