How can I make a Ruby enumerator that does lazy iteration through two other enumerators?
This seems to work just how I want:
enums.lazy.flat_map{|enum| enum.lazy }
Here's the demonstration. Define these yielding methods with side effects:
def test_enum
  return enum_for __method__ unless block_given?
  puts 'hi'
  yield 1
  puts 'hi again'
  yield 2
end
def test_enum2
  return enum_for __method__ unless block_given?
  puts :a
  yield :a
  puts :b
  yield :b
end
concated_enum = [test_enum, test_enum2].lazy.flat_map{|en| en.lazy }
Then call next on the result, which shows that the side effects happen lazily:
[5] pry(main)> concated_enum.next
hi
=> 1
[6] pry(main)> concated_enum.next
hi again
=> 2
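The same behaviour can be checked in a plain script. This is only a sketch: numbers, symbols, first_three and the log array are hypothetical stand-ins for test_enum, test_enum2 and the puts calls above, so that the side effects can be inspected afterwards.

```ruby
# Hypothetical stand-ins for test_enum and test_enum2: they record their
# side effects in `log` instead of printing them.
def numbers(log)
  return enum_for(__method__, log) unless block_given?
  log << "hi"
  yield 1
  log << "hi again"
  yield 2
end

def symbols(log)
  return enum_for(__method__, log) unless block_given?
  log << "a"
  yield :a
  log << "b"
  yield :b
end

# Concatenate lazily and pull only the first three values.
def first_three(log)
  [numbers(log), symbols(log)].lazy.flat_map { |en| en.lazy }.first(3)
end

log = []
p first_three(log)    # => [1, 2, :a]
p log.include?("b")   # => false, the code after the third value never ran
```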
Built-in way to concatenate two Enumerators
You can define a new enumerator, iterating through your existing enumerators. Something like:
enum = Enumerator.new { |y|
  enum1.each { |e| y << e }
  enum2.each { |e| y << e }
}
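On Ruby 2.6 and newer there is also a built-in route, Enumerator#+ and Enumerable#chain, both of which return an Enumerator::Chain. A minimal sketch, assuming that Ruby version; the sample enum1 and enum2 values are made up for illustration:

```ruby
# Requires Ruby 2.6+ for Enumerator#+ and Enumerable#chain.
enum1 = [1, 2].each
enum2 = %i[a b].each

combined = enum1 + enum2        # an Enumerator::Chain
p combined.to_a                 # => [1, 2, :a, :b]
p enum1.chain(enum2).first(3)   # => [1, 2, :a]
```

The chain only advances into the second enumerator once the first is exhausted, so it behaves like the hand-written Enumerator.new version.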
How can I create an enumerator that does certain things after iteration?
You could do something like this.
def foo a, &pr
  if pr
    a.map(&pr).join
  else
    o = Object.new
    o.instance_variable_set :@a, a
    def o.each *y
      # Parentheses make it explicit which block belongs to map and which to foo.
      foo(@a.map { |z| yield z, *y }) { |e| e }
    end
    o.to_enum
  end
end
Then we have
enum = foo([1,2,3])
enum.each { |x| 2 * x } # "246"
or
enum = foo([1,2,3])
enum.with_index { |x, i| x * i } # "026"
Inspiration was drawn from the Enumerator documentation. Note that all of your expectations about enumerators like you asked for hold, because .to_enum takes care of all that. enum is now a legitimate Enumerator!
enum.class # Enumerator
Ruby Enumerator-based lazy flatten method
- This doesn't seem lazy to me, as you are still performing the old (non-lazy) flatten beneath. Enumerator is Enumerable, so I think you don't need to handle it separately.
- I would expect lazy_flatten to be a method on Enumerable.
Here's how I would implement it:
module Enumerable
  def lazy_flatten
    Enumerator.new do |yielder|
      each do |element|
        if element.is_a? Enumerable
          element.lazy_flatten.each do |e|
            yielder.yield(e)
          end
        else
          yielder.yield(element)
        end
      end
    end
  end
end
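To see that this approach really is lazy, it can be run against a nested structure that contains an infinite range. The sketch below uses a hypothetical standalone helper, lazy_flatten_of, which applies the same recursion without reopening Enumerable; the name and the sample data are assumptions made for the illustration.

```ruby
# Hypothetical standalone variant of lazy_flatten: takes the collection as
# an argument instead of being defined on Enumerable.
def lazy_flatten_of(source)
  Enumerator.new do |yielder|
    source.each do |element|
      if element.is_a?(Enumerable)
        # Recurse and forward each nested value as soon as it is produced.
        lazy_flatten_of(element).each { |e| yielder << e }
      else
        yielder << element
      end
    end
  end
end

nested = [1, [2, [3]], (4..Float::INFINITY)]
p lazy_flatten_of(nested).first(6)   # => [1, 2, 3, 4, 5, 6]
```

An eager flatten could never finish on this input, whereas first(6) stops the iteration as soon as six values have been yielded.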
Enumerator as an infinite generator in Ruby
I think I've found something that you may find interesting.
This article, 'Ruby 2.0 Works Hard So You Can Be Lazy' by Pat Shaughnessy, explains the ideas behind eager and lazy evaluation, and also explains how they relate to the "framework classes" like Enumerable, Generator or Yielder. It is mostly focused on explaining how to achieve lazy evaluation, but still, it's quite detailed.
Original Source: 'Ruby 2.0 Works Hard So You Can Be Lazy' by Pat Shaughnessy
Ruby 2.0 implements lazy evaluation using an object called Enumerator::Lazy. What makes this special is that it plays both roles! It is an enumerator, and also contains a series of Enumerable methods. It calls each to obtain data from an enumeration source, and it yields data to the rest of an enumeration.
Since Enumerator::Lazy plays both roles, you can chain them up together to produce a single enumeration.
This is the key to lazy evaluation in Ruby. Each value from the data source is yielded to my block, and then the result is immediately passed along down the enumeration chain. This enumeration is not eager – the Enumerator::Lazy#collect method does not collect the values into an array. Instead, each value is passed one at a time along the chain of Enumerator::Lazy objects, via repeated yields. If I had chained together a series of calls to collect or other Enumerator::Lazy methods, each value would be passed along the chain from one of my blocks to the next, one at a time.
Enumerable#first both starts the iteration by calling each on the lazy enumerators, and ends the iteration by raising an exception when it has enough values.
At the end of the day, this is the key idea behind lazy evaluation: the function or method at the end of a calculation chain starts the execution process, and the program’s flow works backwards through the chain of function calls until it obtains just the data inputs it needs. Ruby achieves this using a chain of Enumerator::Lazy objects.
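The one-value-at-a-time flow described in the quote can be made visible by logging inside each block. This is just an illustrative sketch; trace_lazy_chain and its log array are hypothetical names, not something taken from the article.

```ruby
# Records the order in which the blocks of a lazy chain are called.
def trace_lazy_chain
  log = []
  result = (1..Float::INFINITY).lazy
    .map    { |n| log << "map #{n}"; n * 2 }
    .select { |n| log << "select #{n}"; (n % 3).zero? }
    .first(2)
  [result, log]
end

result, log = trace_lazy_chain
p result         # => [6, 12]
p log.first(4)   # => ["map 1", "select 2", "map 2", "select 4"]
p log.size       # => 12
```

Each number passes through map and then select before the next number is read, and the infinite range is abandoned as soon as first(2) has its two values.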
How to use an enumerator
The main distinction between an Enumerator and most† other data structures in the Ruby core library (Array, Hash) and standard library (Set, SortedSet) is that an Enumerator can be infinite. You cannot have an Array of all even numbers or a stream of zeroes or all prime numbers, but you can definitely have such an Enumerator:
evens = Enumerator.new do |y|
  i = -2
  y << i += 2 while true
end
evens.take(10)
# => [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
zeroes = [0].cycle
zeroes.take(10)
# => [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
So, what can you do with such an Enumerator? Well, three things, basically.
- Enumerator mixes in Enumerable. Therefore, you can use all Enumerable methods such as map, inject, all?, any?, none?, select, reject and so forth. Just be aware that an Enumerator may be infinite whereas map returns an Array, so trying to map an infinite Enumerator may create an infinitely large Array and take an infinite amount of time.
- There are wrapping methods which somehow "enrich" an Enumerator and return a new Enumerator. For example, Enumerator#with_index adds a "loop counter" to the block and Enumerator#with_object adds a memo object.
- You can use an Enumerator just like you would use it in other languages for external iteration by using the Enumerator#next method, which will give you either the next value (and move the Enumerator forward) or raise a StopIteration exception if the Enumerator is finite and you have reached the end.
† E.g., an infinite range: (1..1.0/0)
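The three uses listed above can be sketched in a few lines. The letters and squares_of_evens names and the drain helper are hypothetical, chosen only for this example.

```ruby
letters = %w[a b c].each   # a finite Enumerator

# 1. Enumerable methods work directly on an Enumerator.
p letters.map(&:upcase)                                  # => ["A", "B", "C"]

# 2. Wrapping methods add an index or a memo object.
p letters.with_index(1).to_a                             # => [["a", 1], ["b", 2], ["c", 3]]
p letters.with_object([]) { |ch, memo| memo << ch * 2 }  # => ["aa", "bb", "cc"]

# 3. External iteration with next. Kernel#loop rescues the StopIteration
#    that next raises once the Enumerator is exhausted.
def drain(enum)
  values = []
  loop { values << enum.next }
  values
end
p drain(%w[a b c].each)                                  # => ["a", "b", "c"]

# For an infinite Enumerator, go through lazy before calling map.
evens = Enumerator.new { |y| i = 0; loop { y << i; i += 2 } }
squares_of_evens = evens.lazy.map { |n| n * n }
p squares_of_evens.first(3)                              # => [0, 4, 16]
```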
What's the best way to return an Enumerator::Lazy when your class doesn't define #each?
I think you should return a normal Enumerator using to_enum:
class Calendar
  # ...
  def each_from(first)
    return to_enum(:each_from, first) unless block_given?
    loop do
      yield first if include?(first)
      first += step
    end
  end
end
This is what most Rubyists would expect. Even though it's an infinite Enumerable, it is still usable, for example:
Calendar.new.each_from(1.year.from_now).first(10) # => [...first ten dates...]
If they actually need a lazy enumerator, they can call lazy themselves:
Calendar.new.each_from(1.year.from_now)
  .lazy
  .map{...}
  .take_while{...}
If you really want to return a lazy enumerator, you can call lazy from your method:
# ...
def each_from(first)
  return to_enum(:each_from, first).lazy unless block_given?
  #...
I would not recommend it though, since it would be unexpected (IMO), could be overkill and will be less performant.
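For something runnable without Rails, here is a sketch of the same each_from pattern. WeekdayCalendar, its include? rule (skip weekends) and its step of one day are assumptions invented for the example; they are not the Calendar class from the question.

```ruby
require "date"

# Hypothetical calendar: contains every weekday, steps one day at a time.
class WeekdayCalendar
  def include?(date)
    !(date.saturday? || date.sunday?)
  end

  def step
    1
  end

  def each_from(first)
    return to_enum(:each_from, first) unless block_given?
    loop do
      yield first if include?(first)
      first += step
    end
  end
end

friday = Date.new(2024, 1, 5)
calendar = WeekdayCalendar.new

# The plain Enumerator is infinite but still usable with first.
puts calendar.each_from(friday).first(3)
# 2024-01-05
# 2024-01-08
# 2024-01-09

# Callers who want a lazy chain opt in themselves.
p calendar.each_from(friday).lazy.select(&:monday?).map(&:to_s).first(2)
# => ["2024-01-08", "2024-01-15"]
```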
Finally, there are a couple of misconceptions in your question:
- All methods of Enumerable assume an each, not just lazy.
- You can define an each method that requires a parameter if you like and include Enumerable. Most methods of Enumerable won't work, but each_with_index and a couple of others will forward arguments so these would be usable immediately.
- The Enumerator.new without a block is gone because to_enum is what one should use. Note that the block form remains. There's also a constructor for Lazy, but it's meant to start from an existing Enumerable.
- You state that to_enum never creates a lazy enumerator, but that's not entirely true. Enumerator::Lazy#to_enum is specialized to return a lazy enumerator. Any user method on Enumerable that calls to_enum will keep a lazy enumerator lazy.
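The last point can be checked with a small user-defined method. In this sketch, every_other is a hypothetical method added to Enumerable purely for the demonstration; it is not part of Ruby.

```ruby
module Enumerable
  # Hypothetical user method: yields the 1st, 3rd, 5th, ... element.
  def every_other
    return to_enum(:every_other) unless block_given?
    index = 0
    each do |element|
      yield element if index.even?
      index += 1
    end
  end
end

# Called on an Array, to_enum gives a plain Enumerator.
p [1, 2, 3].every_other.class                      # => Enumerator

# Called on a lazy enumerator, the same to_enum call stays lazy,
# so it is safe on an infinite source.
lazy_result = (1..Float::INFINITY).lazy.every_other
p lazy_result.class                                # => Enumerator::Lazy
p lazy_result.first(3)                             # => [1, 3, 5]
```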
Enumerator::Lazy and Garbage Collection
When you iterate over a plain old array, the garbage collector has no chance to reclaim any element, because the array still references all of them.
You can help the garbage collector by writing nil into the array position after you no longer need the element, so that the object in this position may now be free for collection.
When you use a lazy enumerator correctly, you do not iterate over an array of hashes. Instead you enumerate the hashes, handling one after the other, and each one is read on demand.
So you have the chance to use much less memory (depending on your further processing, and provided that it does not hold the objects in memory anyway).
The structure may look like this:
enum = Enumerator.new do |yielder|
  csv.read(...) do
    ...
    yielder.yield hash
  end
end
enum.lazy.map{|hash| do_something(hash); nil}.count
You also need to make sure that you do not generate the array again in the last step of the chain.
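A runnable sketch of that structure follows. It swaps the CSV reader for a StringIO with naive comma splitting so that it stays self-contained; each_row, count_adults and the sample data are hypothetical, and real CSV parsing would need a proper parser.

```ruby
require "stringio"

# Yields one hash per data row, reading the input line by line so that
# only the current row needs to be referenced at any time.
def each_row(io)
  Enumerator.new do |yielder|
    header = io.gets.chomp.split(",")
    io.each_line do |line|
      yielder << header.zip(line.chomp.split(",")).to_h
    end
  end
end

# count consumes the lazy chain without building an intermediate array.
def count_adults(io)
  each_row(io).lazy
    .select { |row| row["age"].to_i >= 18 }
    .count
end

data = StringIO.new("name,age\nann,30\nbob,12\ncy,18\n")
p count_adults(data)   # => 2
```

Ending the chain with count (or each) keeps memory flat, whereas ending it with to_a or force would rebuild the full array.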