More Efficient Ruby Way to Map Attribute in Array of Objects to Another Array

More efficient Ruby way to map attribute in array of objects to another array?

Use the map method:

Returns a new array with the results of running block once for every element in enum.

def recruits_names
  self.referrals.map { |r| r.display_name }
end

[Update] As indicated by Staelen in the comments, this example can be shortened even further to:

def recruits_names
  self.referrals.map(&:display_name)
end

For the curious, this is because & calls to_proc on the object following it (when used in a method call), and Symbol implements to_proc to return a Proc that executes the method indicated by the symbol on each value yielded to the block (see the documentation).
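To make the to_proc step concrete, here is a minimal sketch. Referral is a hypothetical stand-in for whatever class the referrals association returns; only display_name comes from the answer above.

```ruby
# Hypothetical stand-in for the objects returned by `referrals`.
Referral = Struct.new(:display_name)

REFERRALS = [Referral.new("Ann"), Referral.new("Bob")]

# Explicit block form.
names_block = REFERRALS.map { |r| r.display_name }

# Symbol#to_proc returns a proc that sends :display_name to its argument;
# the & in the call passes that proc as the block.
to_name = :display_name.to_proc
names_proc = REFERRALS.map(&to_name)

# Shorthand form: & calls to_proc on the symbol for us.
names_short = REFERRALS.map(&:display_name)

p names_block   # ["Ann", "Bob"]
p names_proc    # ["Ann", "Bob"]
p names_short   # ["Ann", "Bob"]
```

All three forms return the same array; the shorthand only saves typing.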

How to use one array as a map for another array in ruby and then join them?

As per your question,

arr1.map{|a1| a1.keys.map{|a1k| arr2[0][a1k]*a1[a1k]}.inject(:+)}

If arr2 is just a simple hash:

arr1.map{|a1| a1.keys.map{|a1k| arr2[a1k]*a1[a1k]}.inject(:+)}
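The question's data is not shown here, so the following is only a sketch under assumed inputs: ARR1 is taken to be an array of hashes of quantities, and ARR2 a one-element array holding a hash of per-key multipliers. The method name weighted_sums is made up for illustration.

```ruby
# Assumed sample data; the real arr1/arr2 from the question may differ.
ARR1 = [{ a: 1, b: 2 }, { a: 3, b: 4 }]
ARR2 = [{ a: 10, b: 100 }]

# Same logic as the one-liner above, with the lookup hash passed in.
def weighted_sums(arr1, lookup)
  arr1.map { |a1| a1.keys.map { |k| lookup[k] * a1[k] }.inject(:+) }
end

# Equivalent form using Enumerable#sum (Ruby 2.4+).
def weighted_sums_with_sum(arr1, lookup)
  arr1.map { |a1| a1.sum { |k, v| lookup[k] * v } }
end

p weighted_sums(ARR1, ARR2[0])           # [210, 430]
p weighted_sums_with_sum(ARR1, ARR2[0])  # [210, 430]
```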

Transfer array of objects to array of only one attribute in ruby

This is very straightforward with #map once we correct your syntax in defining arr a bit.

arr = [
  {
    name: 'hihihihi',
    phone: 'null',
    email: 'test@test.com',
    position: 'null'
  },
  {
    name: 'hi',
    phone: 'null',
    email: 'test2@test.com',
    position: 'null'
  }
]

arr.map { |h| h[:email] }

Ruby: Link two arrays of objects by attribute value

If you want to take an adventure through Enumerable, you could say this:

(a.map { |h| [:a, h] } + b.map { |h| [:b, h] })
  .group_by { |_, h| h[:id] }
  .select { |_, a| a.length == 2 }
  .inject({}) { |h, (n, v)| h.update(n => Hash[v]) }

And if you really want the keys to be strings, say n.to_s => Hash[v] instead of n => Hash[v].

The logic works like this:

  1. We need to know where everything comes from, so we decorate the little hashes with :a and :b symbols to track their origins.
  2. Then add the decorated arrays together into one list so that...
  3. group_by can group things into almost-the-final-format.
  4. Then find the groups of size two since those groups contain the entries that appeared in both a and b. Groups of size one only appeared in one of a or b so we throw those away.
  5. Then a little injection to rearrange things into their final format. Note that the arrays we built in (1) just somehow happen to be in the format that Hash[] is looking for.

If you wanted to do this in a link method then you'd need to say things like:

link :a => a, :b => b

so that the method will know what to call a and b. This hypothetical link method also easily generalizes to more arrays:

def link(input)
  input.map { |k, v| v.map { |h| [k, h] } }
       .inject(:+)
       .group_by { |_, h| h[:id] }
       .select { |_, a| a.length == input.length }
       .inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
end

link :a => [...], :b => [...], :c => [...]
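As a usage sketch, here is the link method run against small made-up inputs. The sample hashes (their :name and :color keys) are assumptions for illustration; only the :id key is required by the method.

```ruby
def link(input)
  input.map { |k, v| v.map { |h| [k, h] } }   # tag each hash with its origin
       .inject(:+)                            # one flat list of [origin, hash]
       .group_by { |_, h| h[:id] }            # bucket by id
       .select { |_, a| a.length == input.length }  # keep ids seen in every input
       .inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
end

# Assumed sample data.
A = [{ id: 1, name: "x" }, { id: 2, name: "y" }]
B = [{ id: 2, color: "red" }, { id: 3, color: "blue" }]

linked = link(a: A, b: B)
p linked.keys      # [2]
p linked[2][:a]    # the hash from A with id 2
p linked[2][:b]    # the hash from B with id 2
```

Only id 2 appears in both arrays, so it is the only key in the result. Note that the length check assumes ids are unique within each input array.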

How can I map values in two different arrays as properties on an equivalent array of objects?

a.zip(b).map { |args| Obj.new(*args) }

Per your edit:

a.zip(b).map { |(a, b)| Obj.new(a, b) }
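A minimal runnable sketch of the zip approach. Obj is defined here as a Struct with two made-up fields, since the original class is not shown; build_objects is likewise a hypothetical helper name.

```ruby
# Hypothetical two-field class standing in for the question's Obj.
Obj = Struct.new(:first, :second)

def build_objects(a, b)
  # zip pairs elements by index; the block destructures each pair.
  a.zip(b).map { |x, y| Obj.new(x, y) }
end

objs = build_objects([1, 2, 3], %w[one two three])
p objs.map(&:first)    # [1, 2, 3]
p objs.map(&:second)   # ["one", "two", "three"]
```

If b is shorter than a, zip pads the missing values with nil, so the resulting objects would carry nil in their second field.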

iterating over an array of objects and adding to an earlier element...is any more efficient than iterating over a small portion?

stddev is sqrt(variance). Population variance is the mean of the squared deviations from the mean, which equals the mean of the squares minus the square of the mean. You say you want the running stddev over sublists of 20 elements. So you could calculate this faster by starting with the sum and the sum of squares of the first 20 elements, then iterating through the remaining elements, subtracting the value and the square of the (n-20)th element, adding the value and the square of the new element, and calculating sqrt(current_sum_of_squares/20.0 - (current_sum/20.0)**2) for the stddev. This will result in about a factor of 20 fewer computations than calculating the stddev independently over N-20 20-element sub-lists.

Pushing the stddev onto the (n-20)th element is trivial as it doesn't involve any major mutation to the big list, just an append to that one element.

I've gotta run to a meeting now or I'd show some code. Perhaps later tonight if this isn't clear.
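A sketch of that sliding-window idea, assuming a plain array of numbers and population (not sample) standard deviation. It keeps both a running sum and a running sum of squares, using variance = mean of squares minus square of the mean. The method names are made up for illustration.

```ruby
# Population stddev from a running sum and sum of squares over n values.
def stddev_from_sums(sum, sum_sq, n)
  mean = sum / n
  variance = sum_sq / n - mean * mean
  # Clamp tiny negative values caused by floating-point rounding.
  Math.sqrt([variance, 0.0].max)
end

# Returns one stddev per window of `window` consecutive elements.
def rolling_stddev(values, window = 20)
  return [] if values.length < window

  sum = 0.0
  sum_sq = 0.0
  values.first(window).each do |x|
    sum += x
    sum_sq += x * x
  end

  result = [stddev_from_sums(sum, sum_sq, window)]
  (window...values.length).each do |i|
    leaving = values[i - window]   # element sliding out of the window
    entering = values[i]           # element sliding in
    sum += entering - leaving
    sum_sq += entering * entering - leaving * leaving
    result << stddev_from_sums(sum, sum_sq, window)
  end
  result
end

p rolling_stddev([1, 2, 3, 4], 2)   # [0.5, 0.5, 0.5]
```

Each step does a constant amount of work regardless of window size. One caveat: this formula can lose precision when the values are large relative to their spread, so it is worth checking against a direct computation on real data.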

Why is #map more efficient than #each?

In both of your examples, the second piece of code allocates 100 times as much memory as the first piece of code. It also performs approximately log_1.5(100) resizes of the array (assuming a standard textbook implementation of a dynamic array with a growth factor of 1.5). Resizing an array is expensive (allocating a new chunk of memory, then an O(n) copy of all elements into the new chunk of memory). More generally, garbage collectors hate mutation: they are much more efficient at collecting lots of small short-lived objects than at keeping alive a few large long-lived objects.

In other words, in the first example, you are measuring Array#map and Array#select, respectively, whereas in the second example, you are not only measuring Array#each, but also Array#<< as well as array resizing and memory allocation. It is impossible to tell from the benchmarking results, which of those contributes how much. As Zed Shaw once put it: "If you want to measure something, then don't measure other shit".
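The benchmark from the question is not reproduced here, so the following is only a sketch under the assumption that it compared map against each with <<. The method names and input size are made up. It keeps input construction outside the timed region and makes visible that the each variant also times Array#<< and any array growth.

```ruby
# Assumed shape of the two variants being compared.
def squares_with_map(input)
  input.map { |x| x * x }
end

def squares_with_each(input)
  out = []                          # starts empty, grows as elements arrive
  input.each { |x| out << x * x }   # times each AND << AND resizing
  out
end

# Wall-clock seconds for one run of the block, using a monotonic clock.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

INPUT = (1..100_000).to_a   # built once, outside the timed region

map_secs  = time_it { squares_with_map(INPUT) }
each_secs = time_it { squares_with_each(INPUT) }
puts format("map: %.6fs  each+<<: %.6fs", map_secs, each_secs)
```

A single run like this is still noisy; treat the numbers as a rough indication at best, for the reasons given in the rest of this answer.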

But even if you fix that bug in your benchmark, generally speaking more specialized operations have more information available than general ones, so the more general operations can typically not be faster than the specialized ones.

In your specific example it may just be something very simple: you are using a Ruby implementation that is not very good at optimizing Ruby code (such as YARV, and unlike e.g. TruffleRuby) while at the same time having an optimized native implementation of Array#map and Array#select (again, take YARV as an example, which has C implementations for both of those, and is generally not capable of optimizing Ruby code very well).

And lastly, writing correct microbenchmarks is hard. Really, really, really hard. I encourage you to read and understand this entire discussion thread on the mechanical-sympathy mailing list: JMH vs Caliper: reference thread. While it is specifically about Java benchmarking (actually about JVM benchmarking), many of the arguments apply to any modern high-performance OO execution engine such as Rubinius, TruffleRuby, etc. and to a lesser extent also to YARV. Note that most of the discussion is about writing microbenchmark harnesses, not writing microbenchmarks per se, i.e. it is about writing frameworks that allow developers to write correct microbenchmarks without having to know about that stuff. Unfortunately, even with the best microbenchmark harnesses (and Ruby's Benchmark is actually not a very good one), you still need to have a very deep understanding of modern compilers, garbage collectors, execution engines, CPUs, hardware architectures, and also statistics.

Here is a good example of a failed benchmark that may not be obvious to the untrained benchmark writer: Why is printing “B” dramatically slower than printing “#”?.

Initializing Object with Array of objects from another class Ruby

Your problem is assigning a new value to @stars_array variable on each iteration. There are multiple ways to deal with it:

@stars_array = (0..99).map { |i| Star.new('unknown_star',i) }
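The question's classes are not shown here, so this is a sketch with assumed class and attribute names (SolarSystem, Star, name, index). It shows the map-based fix in context, with the overwrite bug described in a comment.

```ruby
# Assumed minimal Star class; the real one may have different attributes.
class Star
  attr_reader :name, :index

  def initialize(name, index)
    @name = name
    @index = index
  end
end

# Assumed container class.
class SolarSystem
  attr_reader :stars

  def initialize(count = 100)
    # Buggy pattern: (0...count).each { |i| @stars = Star.new('unknown_star', i) }
    # reassigns @stars on every pass, so only the last Star survives.
    # map collects every Star into one array instead.
    @stars = (0...count).map { |i| Star.new('unknown_star', i) }
  end
end

system = SolarSystem.new(3)
p system.stars.length          # 3
p system.stars.map(&:index)    # [0, 1, 2]
```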

By the way, there are a couple of design issues (just for your attention):

  1. Why is the variable called stars_array, not just stars?

  2. Why would an instance of the Star class ever have some object named @star inside? Recursion? :) Seems like @name would be a more appropriate and clearer attribute name.

  3. Don't neglect indentation.


EDIT: About DB-mapping. The most common way is to inherit both classes from ActiveRecord::Base and create a one-to-many relation from solar system to stars. Each class will have its own table. It takes almost no effort.

In Ruby, is there an Array method that combines 'select' and 'map'?

I usually use map and compact together along with my selection criteria as a postfix if. compact gets rid of the nils.

jruby-1.5.0 > [1,1,1,2,3,4].map{|n| n*3 if n==1}    
=> [3, 3, 3, nil, nil, nil]

jruby-1.5.0 > [1,1,1,2,3,4].map{|n| n*3 if n==1}.compact
=> [3, 3, 3]
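On Ruby 2.7 and later, Enumerable#filter_map does the map and the filtering in one pass. A short sketch (triple_ones is a made-up helper name):

```ruby
# Keeps only the truthy block results, so no separate compact is needed.
def triple_ones(numbers)
  numbers.filter_map { |n| n * 3 if n == 1 }
end

p triple_ones([1, 1, 1, 2, 3, 4])   # [3, 3, 3]
```

One difference from map plus compact: filter_map drops false results as well as nil, whereas compact removes only nil.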

