Ruby - Compare two Enumerators elegantly
Just a slight refactoring to your code, assuming that your streams do not include an element :eof
.
def equal_streams?(s1, s2)
loop do
e1 = s1.next rescue :eof
e2 = s2.next rescue :eof
return false unless e1 == e2
return true if e1 == :eof
end
end
Using a keyword like loop
should be faster than using a method like each
.
Best way to interleave two enums in ruby?
Here's a shorter and more general approach:
def combine_enums_with_ratio(ratios)
return enum_for(__method__, ratios) unless block_given?
counts = ratios.transform_values { |value| Rational(1, value) }
until counts.empty?
begin
enum, _ = counts.min_by(&:last)
yield enum.next
counts[enum] += Rational(1, ratios[enum])
rescue StopIteration
counts.delete(enum)
end
end
end
Instead of two enums, it takes a hash of enum => ratio
pairs.
At first, it creates a counts
hash using the ratio's reciprocal, i.e. enum_a => 3, enum_b => 2
becomes:
counts = { enum_a => 1/3r, enum_b => 1/2r }
Then, within a loop, it fetches the hash's minimum value, which is enum_a
in the above example. It yields its next
value and increment its counts
ratio value:
counts[enum_a] += 1/3r
counts #=> {:enum_a=>(2/3), :enum_b=>(1/2)}
On the next iteration, enum_b
has the smallest value, so its next
value will be yielded and its ratio be incremented:
counts[enum_b] += 1/2r
counts #=> {:enum_a=>(2/3), :enum_b=>(1/1)}
If you keep incrementing enum_a
by (1/3)
and enum_b
by (1/2)
, the yield ratio of their elements will be 3:2.
Finally, the rescue
clause handles enums running out of elements. If this happens, that enum is removed from the counts
hash.
Once the counts
hash is empty, the loop stops.
Example usage with 3 enums:
enum_a = (1..10).each
enum_b = ('a'..'f').each
enum_c = %i[foo bar baz].each
combine_enums_with_ratio(enum_a => 3, enum_b => 2, enum_c => 1).to_a
#=> [1, "a", 2, 3, "b", :foo, 4, "c", 5, 6, "d", :bar, 7, "e", 8, 9, "f", :baz, 10]
# <---------------------> <---------------------> <--------------------->
# 3:2:1 3:2:1 3:2:1
Built in way to concatenate two Enumerators
You can define a new enumerator, iterating through your existing enumerators. Something like:
enum = Enumerator.new { |y|
enum1.each { |e| y << e }
enum2.each { |e| y << e }
}
Fibers vs. explicit enumerators
I would use Enumerator
, it allows you to use take
, take_while
, even each
if your sequence is finite. While Fiber
is designed for light weight concurrency and is pretty limited as enumerator.
prime_enum.take(ARGV[0].to_i).each { |x| puts x }
or
prime_enum.take_while { |x| x < ARGV[0].to_i }.each { |x| puts x }
Mapping enumerators
The only way I know of doing this is to do the following:
a = [[1, 2, 3], [4, 5, 6]]
a.map { |b| b.map(&:succ) } # => [[2, 3, 4], [5, 6, 7]]
Mainly because of the combination of Array#map
/Enumerable#map
and Symbol#to_proc
, you cannot pass a second variable to the block that #map
yields for, and thus pass another variable to the inner #map
:
a.map(1) { |b, c| c } # c => 1, but this doesn't work :(
So you have to use the block syntax; Symbol#to_proc
actually returns a proc that takes any number of arguments (you can test this by doing :succ.to_proc.arity
, which returns -1
). The first argument is used as the receiver, and the next few arguments are used as arguments to the method - this is demonstrated in [1, 2, 3].inject(&:+)
. However,
:map.to_proc.call([[1, 2, 3], [4, 5, 6]], &:size) #=> [3, 3]
How? :map.to_proc
creates this:
:map.to_proc # => proc { |receiver, *args, &block| receiver.send(:map, *args, &block) }
This is then called with the array of arrays as an argument, with this block:
:size.to_proc # => proc { |receiver, *args, &block| receiver.send(:size, *args, &block) }
This results in .map { |receiver| receiver.size }
being effectively called.
This all leads to this - since #map
doesn't take extra arguments, and passes them to the block as parameters, you have to use a block.
How can I make a ruby enumerator that does lazy iteration through two other enumerators?
This seems to work just how I want;
enums.lazy.flat_map{|enum| enum.lazy }
Here's the demonstration. Define these yielding methods with side-effects;
def test_enum
return enum_for __method__ unless block_given?
puts 'hi'
yield 1
puts 'hi again'
yield 2
end
def test_enum2
return enum_for __method__ unless block_given?
puts :a
yield :a
puts :b
yield :b
end
concated_enum = [test_enum, test_enum2].lazy.flat_map{|en| en.lazy }
Then call next on the result, showing that the side effects happen lazily;
[5] pry(main)> concated_enum.next
hi
=> 1
[6] pry(main)> concated_enum.next
hi again
=> 2
Ruby - Optimize the comparison of two arrays with duplicates
A similar question was posted a few weeks ago and I got the accepted answer with something like:
def is_subset?(a,b)
!a.find{|x| a.count(x) > b.count(x)}
end
Benchmark Update
require 'benchmark'
def random_char
('a'..'z').to_a.sample
end
A = 8.times.map{random_char}
B = 8.times.map{random_char}
def ary_subset?(a,b) # true iff a is subset of b
a_counts = a.reduce(Hash.new(0)) { |m,v| m[v] += 1; m }
b_counts = b.reduce(Hash.new(0)) { |m,v| m[v] += 1; m }
a_counts.all? { |a_key,a_ct| a_ct <= b_counts[a_key] }
end
Benchmark.bm do |x|
x.report('me') {100000.times{is_subset?(A,B)}}
x.report('dbenhur'){100000.times{ary_subset?(A,B)}}
end
user system total real
me 0.375000 0.000000 0.375000 ( 0.384022)
dbenhur 2.558000 0.000000 2.558000 ( 2.550146)
ruby enumerators: immediately skip multiple iterations (or start iterating from n)
I first construct a helper method, change_base
, with three arguments:
off
, the base-10 offset into the sequence of repeated permutations of the given arrayarr
,m
, a number system base; andp
, the permutation size.
The method performs three steps to construct an array off_m
:
- converts
off
to basem
(radixm
); - separates the digits of the base
m
value into an array; and - if necessary, pads the array with leading
0
s to make it of sizep
.
By setting m = arr.size
, each digit of off_m
is an offset into arr
, so off_m
maps the base-10 offset to a unique permutation of size p
.
def change_base(m, p, off)
arr = off.to_s(m).chars.map { |c| c.to_i(m) }
arr.unshift(*[0]*(p-arr.size))
end
Some examples:
change_base(16, 2, 32)
#=> [2, 0]
change_base(16, 3, 255)
#=> [0, 15, 15]
change_base(36, 4, 859243)
#=> [18, 14, 35, 31]
18*36**3 + 14*36**2 + 35*36**1 + 31
#=> 859243
This implementation of change_base
requires that m <= 36
. I assume that will be sufficient, but algorithms are available to convert base-10 numbers to numbers with arbitrarily-large bases.
We now construct a method which accepts the given array, arr
, the size of each permutation, p
and a given base-10 offset into the sequence of permutations. The method returns a permutation, namely, an array of size p
whose elements are elements of arr
.
def offset_to_perm(arr, p, off)
arr.values_at(*change_base(arr.size, p, off))
end
We can now try this with an example.
arr = (0..3).to_a
p = 2
(arr.size**p).times do |off|
print "perm for off = "
print " " if off < 10
print "#{off}: "
p offset_to_perm(arr, p, off)
end
perm for off = 0: [0, 0]
perm for off = 1: [0, 1]
perm for off = 2: [0, 2]
perm for off = 3: [0, 3]
perm for off = 4: [0, 1]
perm for off = 5: [1, 1]
perm for off = 6: [2, 1]
perm for off = 7: [3, 1]
perm for off = 8: [0, 2]
perm for off = 9: [1, 2]
perm for off = 10: [2, 2]
perm for off = 11: [3, 2]
perm for off = 12: [0, 3]
perm for off = 13: [1, 3]
perm for off = 14: [2, 3]
perm for off = 15: [3, 3]
If we wish to begin at, say, offset 5, we can write:
i = 5
p offset_to_perm(arr, p, i)
[1, 1]
i = i.next #=> 6
p offset_to_perm(arr, p, i)
[2, 1]
...
Does Ruby's Enumerable#zip create arrays internally?
I'll be using 1.9.2-p0 as that's what I have on hand.
The rb_check_array_type
function looks like this:
VALUE
rb_check_array_type(VALUE ary)
{
return rb_check_convert_type(ary, T_ARRAY, "Array", "to_ary");
}
And rb_check_convert_type
looks like this:
VALUE
rb_check_convert_type(VALUE val, int type, const char *tname, const char *method)
{
VALUE v;
/* always convert T_DATA */
if (TYPE(val) == type && type != T_DATA) return val;
v = convert_type(val, tname, method, FALSE);
if (NIL_P(v)) return Qnil;
if (TYPE(v) != type) {
const char *cname = rb_obj_classname(val);
rb_raise(rb_eTypeError, "can't convert %s to %s (%s#%s gives %s)",
cname, tname, cname, method, rb_obj_classname(v));
}
return v;
}
Note the convert_type
call. This looks a lot like C version of Array.try_convert
and try_convert
just happens to look like this:
/*
* call-seq:
* Array.try_convert(obj) -> array or nil
*
* Try to convert <i>obj</i> into an array, using +to_ary+ method.
* Returns converted array or +nil+ if <i>obj</i> cannot be converted
* for any reason. This method can be used to check if an argument is an
* array.
*
* Array.try_convert([1]) #=> [1]
* Array.try_convert("1") #=> nil
*
* if tmp = Array.try_convert(arg)
* # the argument is an array
* elsif tmp = String.try_convert(arg)
* # the argument is a string
* end
*
*/
static VALUE
rb_ary_s_try_convert(VALUE dummy, VALUE ary)
{
return rb_check_array_type(ary);
}
So, yes, the first loop is looking for anything in argv
that is not an array and setting the allary
flag if it finds such a thing.
In enum.c
, we see this:
id_each = rb_intern("each");
So id_each
is an internal reference for the Ruby each
iterator method. And in vm_eval.c
, we have this:
/*!
* Calls a method
* \param recv receiver of the method
* \param mid an ID that represents the name of the method
* \param n the number of arguments
* \param ... arbitrary number of method arguments
*
* \pre each of arguments after \a n must be a VALUE.
*/
VALUE
rb_funcall(VALUE recv, ID mid, int n, ...)
So this:
argv[i] = rb_funcall(argv[i], conv, 1, ID2SYM(id_each));
Is calling to_enum
(with, essentially, the default argument) on whatever is in argv[i]
.
So, the end result of the first for
and if
blocks is that argv
is either full of arrays or full of enumerators rather than possibly being a mix of the two. But note how the logic works: if something is found that isn't an array, then everything becomes an enumerator. The first part of the enum_zip
function will wrap arrays in enumerators (which is essentially free or at least cheap enough not to worry about) but won't expand enumerators into arrays (which could be quite expensive). Earlier versions might have gone the other way (prefer arrays over enumerators), I'll leave that as an exercise for the reader or historians.
The next part:
if (!rb_block_given_p()) {
result = rb_ary_new();
}
Creates a new empty array and leaves it in result
if zip
is being called without a block. And here we should note what zip
returns:
enum.zip(arg, ...) → an_array_of_array
enum.zip(arg, ...) {|arr| block } → nil
If there is a block, then there is nothing to return and result
can stay as Qnil
; if there isn't a block, then we need an array in result
so that an array can be returned.
From parse.c
, we see that NODE_DOT2
is a double-dot range but it looks like they're just using the new node as a simple three element struct; rb_new_node
just allocates an object, sets some bits, and assigns three values in a struct:
NODE*
rb_node_newnode(enum node_type type, VALUE a0, VALUE a1, VALUE a2)
{
NODE *n = (NODE*)rb_newobj();
n->flags |= T_NODE;
nd_set_type(n, type);
n->u1.value = a0;
n->u2.value = a1;
n->u3.value = a2;
return n;
}
nd_set_type
is just a bit fiddling macro. Now we have memo
as just a three element struct. This use of NODE_DOT2
appears to be a convenient kludge.
The rb_block_call
function appears to be the core internal iterator. And we see our friend id_each
again so we'll be doing an each
iteration. Then we see a choice between zip_i
and zip_ary
; this is where the inner arrays are created and pushed onto result
. The only difference between zip_i
and zip_ary
appears to be the StopIteration exception handling in zip_i
.
At this point we've done the zipping and we either have the array of arrays in result
(if there was no block) or we have Qnil
in result
(if there was a block).
Executive Summary: The first loop explicitly avoids expanding enumerators into arrays. The zip_i
and zip_ary
calls will only work with non-temporary arrays if they have to build an array of arrays as a return value. So, if you call zip
with at least one non-array enumerator and use the block form, then it is enumerators all the way down and the "problem with zip is that it creates arrays internally" does not happen. Reviewing 1.8 or other Ruby implementations is left as an exercise for the reader.
Loop through array and compare consecutive elements in Ruby
Here is a "direct" translation of the python code to ruby - note that the syntax is almost identical! (The only subtle differences are elif
vs elsif
and the print
/puts
syntax.)
amounts = [10, 9, 10, 3, 100, 100, 90, 80, 10, 30, 10]
for i in 0..(amounts.count)-2 do
if amounts[i+1] > amounts[i]
puts "up #{amounts[i+1]-amounts[i]}"
elsif amounts[i+1] < amounts[i]
puts "down #{amounts[i]-amounts[i+1]}"
else
puts "stay"
end
end
There was nothing wrong with using Array#at
, but it's just an alias for Array#[]
. So if your question is about "translating" the code, then I don't see the point in making changes-for-the-sake-of-changes.
Note that I also fixed the off-by-one error that's present in both versions of your code -- you can only run this up to len(amounts)-2
, not len(amounts)-1
, or you'll go out of bounds!
However, this is not really written in "the ruby way". for
loops are rarely used in ruby, because the language has other highly expressive iterators. Here is how I would have written it, using Enumerable#each_cons
:
amounts = [10, 9, 10, 3, 100, 100, 90, 80, 10, 30, 10]
amounts.each_cons(2) do |first, second|
if first < second
puts "up #{second - first}"
elsif first > second
puts "down #{first - second}"
else
puts "stay"
end
end
# Output:
down 1
up 1
down 7
up 97
stay
down 10
down 10
down 70
up 20
down 20
One advantage to this syntax is that off-by-one errors (like you had in the python!) aren't really possible, because you're just saying "loop over all elements" without having to worry about tracking the indexes yourself.
This is actually a really good example for appreciating the elegance of ruby, if code is written in "the ruby way" rather than merely a direct translation.
Related Topics
Testing Rake Tasks with Rspec Is Not Accepting Arguments
How to Create All Possible Combinations with Two Arrays
Actioncontroller::Routingerror (No Route Matches [Get] "/"):
Finding the Date for a Given Week Number
Ruby Defining Operator Procedure
Ruby's Argv Can Be Empty on Windows Depending on a Way to Run Script
What Regex How to Use to Get the Domain Name from a Url in Ruby
Rails Form Object with Reform-Rails with Collections Not Working or Validating
How to Specify a Regex Character Range That Will Work in European Languages Other Than English
How to Remove a Row from a CSV with Ruby
Read and Write File Atomically
Should I Be Using Rails or Ruby for This Website Application? How