How can I create an enumerator that does certain things after iteration?
You could do something like this.
def foo a, &pr
  if pr
    a.map(&pr).join
  else
    o = Object.new
    o.instance_variable_set :@a, a
    def o.each *y
      # Parentheses make the identity block bind to foo, not to map.
      foo(@a.map { |z| yield z, *y }) { |e| e }
    end
    o.to_enum
  end
end
Then we have
enum = foo([1,2,3])
enum.each { |x| 2 * x } # "246"
or
enum = foo([1,2,3])
enum.with_index { |x, i| x * i } # "026"
Inspiration was drawn from the Enumerator documentation. Note that all of your expectations about enumerators hold, because .to_enum takes care of all that. enum is now a legitimate Enumerator!
enum.class # Enumerator
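The same effect can also be had with a small class, offered here as a sketch rather than as the answer's own method. JoiningCollection is a hypothetical name, and join stands in for whatever work should happen after iteration. It relies on the fact that Enumerator#each and Enumerator#with_index return whatever the underlying each method returns.

```ruby
# Hypothetical wrapper: iterates like an array, then joins the block results.
class JoiningCollection
  def initialize(items)
    @items = items
  end

  def each
    # Without a block, hand back an Enumerator bound to this method.
    return to_enum(:each) unless block_given?

    # Post-iteration step: collect what the block returned and join it.
    @items.map { |item| yield item }.join
  end
end

joined = JoiningCollection.new([1, 2, 3]).each
joined.each { |x| 2 * x }           # => "246"
joined.with_index { |x, i| x * i }  # => "026"
joined.next                         # => 1
```

Returning to_enum when no block is given is the same convention the core collections follow, so external iteration with next keeps working.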
How does Ruby's Enumerator object iterate externally over an internal iterator?
It's not exactly magic, but it is beautiful nonetheless. Instead of making a copy of some sort, a Fiber is used to first execute each on the target enumerable object. After receiving the next object of each, the Fiber yields this object and thereby returns control back to where the Fiber was resumed initially.
It's beautiful because this approach doesn't require a copy or other form of "backup" of the enumerable object, as one could imagine obtaining by, for example, calling #to_a on the enumerable. Cooperative scheduling with fibers makes it possible to switch contexts exactly when needed, without keeping any form of lookahead.
It all happens in the C code for Enumerator. A pure Ruby version that would show roughly the same behavior could look like this:
class MyEnumerator
  def initialize(enumerable)
    @fiber = Fiber.new do
      enumerable.each { |item| Fiber.yield item }
    end
  end

  def next
    @fiber.resume || raise(StopIteration.new("iteration reached an end"))
  end
end

class MyEnumerable
  def each
    yield 1
    yield 2
    yield 3
  end
end

e = MyEnumerator.new(MyEnumerable.new)
puts e.next # => 1
puts e.next # => 2
puts e.next # => 3
puts e.next # => StopIteration is raised
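One limitation of the version above is that the `||` check treats a nil or false item as the end of the iteration, and it depends on what each itself returns (Array#each returns the array, which is truthy). The following is a minimal sketch of one way around that, assuming a private sentinel object marks the end. SentinelEnumerator and DONE are hypothetical names, and this is not how the C implementation works.

```ruby
class SentinelEnumerator
  DONE = Object.new # hypothetical end-of-iteration marker

  def initialize(enumerable)
    @finished = false
    @fiber = Fiber.new do
      enumerable.each { |item| Fiber.yield item }
      DONE # the fiber's final value, regardless of what each returned
    end
  end

  def next
    raise StopIteration, "iteration reached an end" if @finished

    value = @fiber.resume
    if value.equal?(DONE)
      @finished = true # never resume a dead fiber
      raise StopIteration, "iteration reached an end"
    end
    value
  end
end

e = SentinelEnumerator.new([nil, false, 3])
values = [e.next, e.next, e.next] # => [nil, false, 3]

# A fourth call raises StopIteration; capture that as a boolean here.
finished = begin
  e.next
  false
rescue StopIteration
  true
end
finished # => true
```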
Enumerator without collection
With the current setup you can't pass any collection to it. You can't change the collection of the enumerator once instantiated.
The current code only appears to work because the block is not executed immediately, so you only see the error once you start iterating (or retrieving items).
enumerator = Enumerator.new(&:map)
enumerator.take(1)
# NoMethodError (undefined method `map' for #<Enumerator::Yielder:0x00000000055b6e90>)
This is because Enumerator::new yields an Enumerator::Yielder, which doesn't have a map method.
The above could also be written as:
enumerator = Enumerator.new { |yielder| yielder.map }
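The deferred execution is easy to observe. In this small sketch the log array is a hypothetical stand-in for any side effect; the block passed to Enumerator.new does not run until something iterates.

```ruby
log = []
lazy_started = Enumerator.new do |yielder|
  log << :block_started
  yielder << 1
  yielder << 2
end

before = log.dup               # => []  (the block has not run yet)
first  = lazy_started.take(1)  # => [1]
after  = log.dup               # => [:block_started]
```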
If you would like to create an enumerator from a collection, the easiest way is to call each without a block. Other methods like map also return enumerators when no block is given.
enumerator = [1, 2, 3].each
#=> #<Enumerator: [1, 2, 3]:each>
If, for some reason, you still want to create the enumerator by hand, it could look like this:
enumerator = Enumerator.new { |yielder| [1, 2, 3].each { |number| yielder << number } }
If the intent was to preselect a method of iterating before you receive the collection, you can do so in the following manner:
# assuming both the collection and block are passed by the user
map = :map.to_proc
result = map.call(collection, &block)
# which is equivalent to
result = collection.map(&block)
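The same idea can be packaged so the method is chosen first and the collection supplied later. This is a sketch only: iterate_with is a hypothetical helper, and public_send is used here in place of Symbol#to_proc.

```ruby
# Hypothetical helper: fixes the method name now, takes the collection later.
iterate_with = lambda do |method_name|
  ->(collection, &block) { collection.public_send(method_name, &block) }
end

mapper   = iterate_with.call(:map)
selector = iterate_with.call(:select)

doubled = mapper.call([1, 2, 3]) { |n| n * 2 }  # => [2, 4, 6]
even    = selector.call([1, 2, 3, 4], &:even?)  # => [2, 4]
pending = mapper.call([1, 2, 3])                # no block given, so an Enumerator
```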
How to use an enumerator
The main distinction between an Enumerator and most† other data structures in the Ruby core library (Array, Hash) and standard library (Set, SortedSet) is that an Enumerator can be infinite. You cannot have an Array of all even numbers or a stream of zeroes or all prime numbers, but you can definitely have such an Enumerator:
evens = Enumerator.new do |y|
  i = -2
  y << (i += 2) while true
end
evens.take(10)
# => [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
zeroes = [0].cycle
zeroes.take(10)
# => [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
So, what can you do with such an Enumerator? Well, three things, basically.
1. Enumerator mixes in Enumerable. Therefore, you can use all Enumerable methods such as map, inject, all?, any?, none?, select, reject and so forth. Just be aware that an Enumerator may be infinite whereas map returns an Array, so trying to map an infinite Enumerator may create an infinitely large Array and take an infinite amount of time.
2. There are wrapping methods which somehow "enrich" an Enumerator and return a new Enumerator. For example, Enumerator#with_index adds a "loop counter" to the block and Enumerator#with_object adds a memo object.
3. You can use an Enumerator just like you would use it in other languages for external iteration by using the Enumerator#next method, which will give you either the next value (and move the Enumerator forward) or raise a StopIteration exception if the Enumerator is finite and you have reached the end.
† E.g., an infinite range: (1..1.0/0)
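The three uses can be tried on an infinite enumerator in a few lines. This is a sketch: the variable names are illustrative, and Enumerable#lazy is brought in (it is not mentioned above) as one way to keep map finite.

```ruby
evens = Enumerator.new do |y|
  i = -2
  loop { y << (i += 2) }
end

# 1. Enumerable methods: go through lazy (or take/first) so map stays finite.
squares = evens.lazy.map { |n| n * n }.first(4)  # => [0, 4, 16, 36]

# 2. Wrapping methods: with_index pairs each value with a counter.
indexed = evens.with_index.take(3)               # => [[0, 0], [2, 1], [4, 2]]

# 3. External iteration: next advances, peek looks ahead without advancing.
first_value  = evens.next  # => 0
second_value = evens.next  # => 2
upcoming     = evens.peek  # => 4
```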
Is there a built-in way to check if #next or #peek will raise StopIteration?
You can rescue the StopIteration explicitly, but the loop method also rescues a StopIteration exception internally and simply exits the loop. (Inside loop, raise StopIteration has the same effect as break.)
This code simply exits the loop when you try to peek past the end:
a = %w(a b c d e).to_enum
loop do
  print a.peek
  a.next
end
The code outputs abcde. (It also transparently raises and rescues StopIteration.)
So, if you want to simply ignore the StopIteration exception when you try to peek past the end, just use loop.
Of course, once you peek past the end, you'll get dumped out of the loop. If you don't want that, you can use while and rescue to customize behavior. For example, if you want to avoid exiting if you peek past the end, and exit when you iterate past the end using next, you could do something like this:
a = %w(a b c d e).to_enum
while true
  begin
    print a.peek
  rescue StopIteration
    print "\nTried to peek past the end of the enum.\nWe're gonna overlook that.\n"
  end
  x = a.next rescue $!
  break if x.class == StopIteration
end
p 'All done!'
The last two lines in the loop do the same thing as this, which you could use instead:
begin
  a.next
rescue StopIteration
  break
end
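If a yes/no check reads better at the call site, the rescue can be wrapped in a small predicate. This is a sketch: more_items? is a hypothetical helper, not a core method, and because it calls peek it may cause the underlying each to produce (and buffer) the next element.

```ruby
# Hypothetical helper, not part of Ruby's core API.
def more_items?(enum)
  enum.peek
  true
rescue StopIteration
  false
end

letters = %w(a b).to_enum
seen = []
seen << letters.next while more_items?(letters)
seen # => ["a", "b"]
```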
A point to make is that handling StopIteration is Ruby's intended way of dealing with getting to the end of an iterator. Quoting from Matz's book The Ruby Programming Language:
External iterators are quite simple to use: just call next each time you want another element. When there are no more elements left, next will raise a StopIteration exception.
This may seem unusual—an exception is raised for an expected termination condition rather than an unexpected and exceptional event. (StopIteration is a descendant of StandardError and IndexError; note that it is one of the only exception classes that does not have the word “error” in its name.) Ruby follows Python in this external iteration technique. By treating loop termination as an exception, it makes your looping logic extremely simple; there is no need to check the return value of next for a special end-of-iteration value, and there is no need to call some kind of next? predicate before calling next.
Ruby peek with include? acts like next
It's not the .include?, it's how you get your enumerator (a new one each time). Observe:
@file.each_line.peek # => "Extension Date\n"
@file.each_line.peek # => "State\n"
@file.each_line.peek # => "CO\n"
@file.each_line.peek # => "COLORADO\n"
@file.each_line.peek # => "\n"
The problem here is that every call to each_line returns a new enumerator, and calling peek on a new enumerator has to read a line from the file. Since the file position is maintained between invocations, the second call reads the next line, and so on.
Get the enumerator once and hold on to it.
enum = @file.each_line
enum.peek # => "Extension Date\n"
enum.peek # => "Extension Date\n"
enum.peek # => "Extension Date\n"
enum.peek # => "Extension Date\n"
enum.peek.include?('foo') # => false
enum.peek # => "Extension Date\n"
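The effect can be reproduced without a file on disk. The sketch below assumes a StringIO with made-up contents in place of @file; like a File, it keeps its read position between calls.

```ruby
require "stringio"

io = StringIO.new("first\nsecond\nthird\n")

# A new enumerator on every call: each peek consumes another line.
peek_one = io.each_line.peek  # => "first\n"
peek_two = io.each_line.peek  # => "second\n"

# One enumerator, held on to: peek keeps returning the same buffered line.
held = io.each_line
held_one = held.peek          # => "third\n"
held_two = held.peek          # => "third\n"
```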