Ruby methods that either yield or return an Enumerator
The core libraries insert a guard at the start of the method:
return to_enum(:name_of_this_method, arg1, arg2, ..., argn) unless block_given?
In your case:
class Array
  def double
    return to_enum(:double) unless block_given?
    each { |x| yield 2*x }
  end
end
>> [1, 2, 3].double { |x| puts(x) }
2
4
6
>> ys = [1, 2, 3].double.select { |x| x > 3 }
#=> [4, 6]
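The guard also forwards the method's arguments, so the enumerator can replay the call with the same parameters. A minimal sketch, assuming a hypothetical Array#scale method (the name and behavior are illustrative only, not part of Ruby or of the question):

```ruby
class Array
  # Hypothetical method for illustration: yields each element times factor.
  def scale(factor)
    # Pass the argument along so the enumerator calls scale(factor) again later.
    return to_enum(:scale, factor) unless block_given?
    each { |x| yield factor * x }
  end
end

# Block form: the block runs once per element.
collected = []
[1, 2, 3].scale(3) { |x| collected << x }   # collected is [3, 6, 9]

# Non-block form: an Enumerator that can be chained.
enum = [1, 2, 3].scale(3)
labelled = enum.with_index.map { |x, i| "#{i}:#{x}" }
# labelled is ["0:3", "1:6", "2:9"]
```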
What is the proper way to use methods that return an enumerator in Ruby array?
From the docs for Enumerator:
Most methods have two forms: a block form where the contents are
evaluated for each item in the enumeration, and a non-block form which
returns a new Enumerator wrapping the iteration.
This allows you to chain Enumerators together. For example, you can
map a list’s elements to strings containing the index and the element
as a string via:
puts %w[foo bar baz].map.with_index {|w,i| "#{i}:#{w}" }
# => ["0:foo", "1:bar", "2:baz"]
An Enumerator can also be used as an external iterator. For example,
Enumerator#next returns the next value of the iterator or raises
StopIteration if the Enumerator is at the end.
e = [1,2,3].each # returns an enumerator object.
puts e.next # => 1
puts e.next # => 2
puts e.next # => 3
puts e.next # raises StopIteration
I'm sorry for copy-paste, but I couldn't explain better.
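As a small addition to the quoted docs, here is a sketch of external iteration that does not need an explicit rescue: Kernel#loop rescues StopIteration and simply ends. The variable names are illustrative.

```ruby
e = [1, 2, 3].each   # no block, so this is an Enumerator

seen = []
loop do
  seen << e.next     # e.next raises StopIteration after the last element;
end                  # loop rescues it and terminates normally
# seen is [1, 2, 3]

e.rewind             # reset the external iteration position
first_again = e.peek # look at the next value without consuming it: 1
```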
Why return an enumerator?
This is completely in accordance with the spirit of 1.9: to return enumerators whenever possible. String#bytes, String#lines, String#codepoints, but also methods like Array#permutation all return an enumerator.
In Ruby 1.8, String#to_a resulted in an array of lines, but the method is gone in 1.9.
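A quick sketch of what this looks like in practice. Note that the answer describes Ruby 1.9; from Ruby 2.0 on, String#lines, String#bytes and String#codepoints return arrays, so the example uses the each_ variants, which return an Enumerator when called without a block.

```ruby
chars = "ab".each_char            # Enumerator over "a", "b"
lines = "one\ntwo\n".each_line    # Enumerator over the lines
perms = [1, 2, 3].permutation(2)  # Enumerator over 2-element permutations

line_list = lines.to_a            # ["one\n", "two\n"]
first_two = perms.first(2)        # [[1, 2], [1, 3]]
```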
Ruby: Is there something like Enumerable#drop that returns an enumerator instead of an array?
If you need it more than once, you could write an extension to Enumerator.
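class Enumerator
  def enum_drop(n)
    # Without a block, return an enumerator so the call can be chained.
    return to_enum(:enum_drop, n) unless block_given?
    with_index do |val, idx|
      next if idx < n # skip the first n elements, like Enumerable#drop
      yield val
    end
  end
end
File.open(testfile).each_line.enum_drop(1) do |line|
  print line
end
# prints lines #2, #3, #4, …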
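On Ruby 2.0 or later, a lazy enumerator may be a simpler alternative to a custom extension, since Enumerator::Lazy#drop returns another lazy enumerator rather than an array. A minimal sketch, using a StringIO in place of the file so that it is self-contained:

```ruby
require "stringio"

# Stand-in for File.open(testfile); the contents are made up for the example.
io = StringIO.new("first\nsecond\nthird\n")

rest = io.each_line.lazy.drop(1)  # nothing has been read yet
remaining = rest.to_a             # forces the iteration
# remaining is ["second\n", "third\n"]
```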
Get slice of an Enumerator effectively
An Enumerator is a very general interface; it makes only very simple assumptions about the "collection" it is traversing. In particular, it really only supports two operations: get the current element and iterate to the next element.
Given those two operations, if you want to get the 10 millionth element, there is only one thing you can do: iterate 10 million times. Which takes time.
There is no such thing as "slicing" an Enumerator. An Enumerator enumerates. That's it.
Now, as you discovered, there is another problem: Ruby's collection operations are not type-preserving. No matter what type of collection you call map or select or take or whatever on, it will always return the same type: a fully realized, concrete, strict Array. That's how most collection frameworks in most languages work, e.g. in .NET all collection operations return IEnumerable. That's because most of these methods have only a single common implementation in the Enumerable mixin.
Smalltalk is an exception, but there is another problem: the collection operations are duplicated for every single collection type. Every collection type has its own nearly identical, practically copy-and-paste implementation of collect:, select:, etc. This code duplication is hard to maintain and places a big burden on anyone who wants to integrate their own collection into the framework. In Ruby, it's easy: implement each, mix in Enumerable, and you're done.
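To illustrate both points, here is a minimal sketch of a custom collection. The Playlist class is hypothetical; it only implements each and includes Enumerable, and the inherited operations return plain Arrays rather than Playlists.

```ruby
# Hypothetical collection type, for illustration only.
class Playlist
  include Enumerable

  def initialize(*tracks)
    @tracks = tracks
  end

  # The only method Enumerable needs from us.
  def each
    return to_enum(:each) unless block_given?
    @tracks.each { |t| yield t }
    self
  end
end

list = Playlist.new("intro", "chorus", "outro")

upcased = list.map(&:upcase)              # ["INTRO", "CHORUS", "OUTRO"]
short   = list.select { |t| t.size == 5 } # ["intro", "outro"]
# Both results are Arrays, not Playlists.
```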
Note: as of Ruby 1.9, there is actually some of that duplication: Hash implements its own version of select which does actually return a Hash and not an Array. So, now, not only is there code duplication but also an asymmetry in the interface: all implementations of select return Arrays except for the one in Hash.
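The asymmetry is easy to observe. A short sketch with made-up data:

```ruby
h = { a: 1, b: 2, c: 3 }

selected = h.select { |_k, v| v > 1 }    # Hash#select keeps the type
mapped   = h.map { |k, v| [k, v * 2] }   # Enumerable#map returns an Array
evens    = (1..5).select(&:even?)        # Enumerable#select returns an Array

# selected is a Hash with the pairs b => 2 and c => 3
# mapped   is [[:a, 2], [:b, 4], [:c, 6]]
# evens    is [2, 4]
```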
The Scala 2.8 collection framework is the first time ever that someone has figured out how to provide type-preserving collection operations without code duplication. But Ruby's collection framework was designed 15 years before Scala 2.8, so it cannot take advantage of that knowledge.
In Ruby 2.0, there are lazy Enumerators, where all collection operations return another lazy Enumerator. But that won't help you here: the only difference is that the lazy Enumerator will delay the 10 million iterations until you actually print the values. It still has to perform those 10 million iterations because there is simply no way to do otherwise.
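A small sketch that makes the delayed but unavoidable work visible. The counter and the offset of 1000 (instead of 10 million) are assumptions chosen to keep the example fast.

```ruby
produced = 0

# An endless source that counts how many elements it has generated.
source = Enumerator.new do |y|
  i = 0
  loop do
    i += 1
    produced += 1
    y << i
  end
end

lazy_slice = source.lazy.drop(1000)  # builds a lazy Enumerator, no iteration yet
produced_before = produced           # still 0

values = lazy_slice.first(3)         # now the iteration actually happens
# values is [1001, 1002, 1003]
# produced is now greater than 1000: the skipped elements were generated too
```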
If you want slicing, you need a sliceable data structure, such as an Array.
Easily create an Enumerator
Whereas the attr_ methods create new instance methods, your make_enum modifies an existing method, which is rather similar to the protected, private, and public methods. Note that these visibility methods are used in one of these forms:
protected
def foo; ... end
or
protected def foo; ... end
or
def foo; ... end
protected :foo
The latter two forms are already available with your make_enum. In particular, the second form is already possible (which Stefan also notes in the comment). You can do:
make_enum def test; ... end
If you want to support the first form, you should try to implement that in your make_enum definition.
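The question's make_enum is not shown here, so the following is only a guess at what such a helper might look like: a sketch that wraps an existing method so that it returns an Enumerator when no block is given. The MakeEnum module and the Countdown class are hypothetical, and keyword arguments are not handled.

```ruby
# Hypothetical helper; the asker's actual make_enum may differ.
module MakeEnum
  def make_enum(name)
    original = instance_method(name)
    define_method(name) do |*args, &block|
      # No block: return an Enumerator that will call this method again.
      return to_enum(name, *args) unless block
      original.bind(self).call(*args, &block)
    end
    name # return the symbol so other modifiers can be chained
  end
end

class Countdown
  extend MakeEnum

  # Second form: def returns the method name as a symbol (Ruby 2.1+).
  make_enum def from(n)
    n.downto(1) { |i| yield i }
  end
end

as_array = Countdown.new.from(3).to_a      # [3, 2, 1]
yielded = []
Countdown.new.from(2) { |i| yielded << i } # yielded is [2, 1]
```

The third form, make_enum :from after the definition, works with the same sketch because the helper only needs the method name.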
Chaining enumerators that yield multiple arguments
From the discussion so far, it follows that we can analyze the source code, but we do not know the reasons behind it. The Ruby core team is quite responsive. I recommend that you sign in at http://bugs.ruby-lang.org/issues/ and post a bug report there. They will surely look at this issue within a few weeks at most, and you can probably expect it to be corrected in the next minor version of Ruby. (That is, unless there is a design rationale unknown to us to keep things as they are.)
Ruby: the yield inside of a block
The block is passed to the function much like an argument. This can be specified explicitly, like so:
class Test
  def my_each(&block)
    "abcdeabcabc".scan("a") do |x|
      puts "!!! block"
      yield x
      # Could be replaced with: block.call(x)
    end
  end
end
Technically, it's exactly the same (the puts is there for clarification); the block's presence is not checked the way it usually is for arguments. Should you forget to give it a block, the function will halt on the first yield it has to execute, with exactly the same LocalJumpError (at least, that's what I get on Rubinius). However, notice the "!!! block" in the console before it happens.
It works like that for a reason. You could check whether your function is given a block, if it is specified explicitly as above, using if block, and then skip the yields. A good example of that is the content_tag helper for Rails. Calls of this helper can be block-nested. A simplistic example:
content_tag :div do
  content_tag :div
end
...to produce output like:
<div>
  <div></div>
</div>
So, the block is executed "on top" (in terms of the call stack) of your method. It is called each time a yield happens, as a sort of function call on the block. It is not accumulated anywhere to be executed afterwards.
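The interleaving can be made visible by recording events in an array instead of printing them. This is a sketch along the lines of the example above; the each_a method and the log messages are illustrative.

```ruby
def each_a(log)
  "abcdeabcabc".scan("a") do |x|
    log << "before yield"
    yield x                 # the caller's block runs right here
    log << "after yield"
  end
end

# With a block: the block's entries sit between "before" and "after".
log = []
each_a(log) { |x| log << "block got #{x}" }
# log is ["before yield", "block got a", "after yield"] repeated three times

# Without a block: the method runs until the first yield, then raises.
no_block_log = []
error = begin
  each_a(no_block_log)
  nil
rescue LocalJumpError => e
  e
end
# no_block_log is ["before yield"]; error is a LocalJumpError
```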
UPD:
The Enumerator returned by many each methods is explicitly constructed by those iterators to save the context of what should happen.
It could be implemented like this in my_each:
class Test
  def my_each(&block)
    if block
      "abcdeabcabc".scan("a") { |x| yield x }
    else
      # to_enum replaces the deprecated Enumerator.new(self, :my_each) form.
      to_enum(:my_each)
    end
  end
end