Ruby - How to Retrieve Sum in Array Group by Multiple Keys with Condition Max

Ruby - How to retrieve sum in array group by multiple keys with condition max

One way of doing this is to use the form of Hash#update (aka merge!) that uses a block to determine the values of keys that are present in both hashes being merged.
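
As a minimal sketch of that form (the hashes and their values are made up for illustration), the block is called only for keys present in both hashes, and it receives the key together with both values:

```ruby
# Hypothetical data, for illustration only.
totals = { "apples" => 2, "pears" => 1 }
extra  = { "apples" => 3, "plums" => 5 }

# The block runs only for "apples", the one key present in both hashes.
# "pears" is kept as-is and "plums" is simply added.
totals.update(extra) { |_key, old_val, new_val| old_val + new_val }

# totals is now { "apples"=>5, "pears"=>1, "plums"=>5 }
```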

Code

def f_addition(arr, group_fields, sum_fields, max_fields)
  arr.each_with_object({}) do |h,g|
    g.update( h.values_at(*group_fields) => h ) do |_,gv,hv|
      gv.merge(hv) do |k,gvv,hvv|
        case
        when sum_fields.include?(k) then "%.2f" % (gvv.to_f + hvv.to_f)
        when max_fields.include?(k) then [gvv, hvv].max
        else gvv
        end
      end
    end
  end.values
end

Example

arr = [{ "id"=>2, "idx"=>111, "money"=>"4.00", "money1"=>"1.00",
"order"=>"001", "order1"=>"1", "pet"=>"dog" },
{ "id"=>1, "idx"=>112, "money"=>"2.00", "money1"=>"2.00",
"order"=>"001", "order1"=>"1", "sport"=>"darts" },
{ "id"=>3, "idx"=>113, "money"=>"3.00", "money1"=>"1.00",
"order"=>"002", "order1"=>"2" }]

Notice that this array is slightly different from the one given in the question. I have added "pet"=>"dog" to the first hash and "sport"=>"darts" to the second.

f_addition(arr, ["order","order1"], ["money","money1"], ["id", "idx"] )
#=> [{ "id"=>2, "idx"=>112, "money"=>"6.00", "money1"=>"3.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog", "sport"=>"darts"},
# { "id"=>3, "idx"=>113, "money"=>"3.00", "money1"=>"1.00",
# "order"=>"002", "order1"=>"2" }]

Explanation

For the example above:

group_fields = ["order", "order1"]
sum_fields = ["money", "money1"]
max_fields = ["id", "idx"]

enum = arr.each_with_object({})
#=> #<Enumerator: [{"id"=>2, "idx"=>111,..., "pet"=>"dog"},
# {"id"=>1, "idx"=>112,..., "sport"=>"darts"},
# {"id"=>3,"idx"=>113,...,"order1"=>"2"}]:each_with_object({})>

Each element generated by this enumerator is passed to the block and assigned to the block variables. The first element is:

h, g = enum.next
#=> [{ "id"=>2, "idx"=>111, "money"=>"4.00", "money1"=>"1.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog" }, {}]
h #=> { "id"=>2, "idx"=>111, "money"=>"4.00", "money1"=>"1.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog" }
g #=> {}

As:

h.values_at(*group_fields)
#=> h.values_at(*["order", "order1"])
#=> h.values_at("order", "order1")
#=> ["001", "1"]

we compute:

g.update(["001", "1"] => h) do |k,gv,hv| ... end

which is shorthand for:

g.update({ ["001", "1"] => h }) do |k,gv,hv| ... end

The block do |k,gv,hv| ... end is only used when the two hashes being merged both contain the key k (see note 1). As g = {} contains no keys, the block is not used at this time:

g.update({ ["001", "1"] => h })
#=> {}.update({ ["001", "1"]=>{ "id"=>2, "idx"=>111, "money"=>"4.00",
# "money1"=>"1.00", "order"=>"001",
# "order1"=>"1", "pet"=>"dog" } })
#=> { ["001", "1"]=>{ "id"=>2, "idx"=>111, "money"=>"4.00", "money1"=>"1.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog" } }

where the value returned by update is the new value of g.

The next value of enum passed into the block is:

h, g = enum.next
h #=> { "id"=>1, "idx"=>112, "money"=>"2.00", "money1"=>"2.00",
# "order"=>"001", "order1"=>"1", "sport"=>"darts" }
g #=> { ["001", "1"]=>{ "id"=>2, "idx"=>111, "money"=>"4.00", "money1"=>"1.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog" } }

As:

h.values_at(*group_fields)
#=> h.values_at("order", "order1")
#=> ["001", "1"]

we compute:

g.update(["001", "1"] => h) do |k,gv,hv| ... end

As g and { ["001", "1"] => h } both contain the key ["001", "1"], we must defer to the block to determine the value of that key in the merged hash. We have:

k  = ["001", "1"]
gv = { "id"=>2, "idx"=>111, "money"=>"4.00", "money1"=>"1.00",
"order"=>"001", "order1"=>"1", "pet"=>"dog" }
hv = { "id"=>1, "idx"=>112, "money"=>"2.00", "money1"=>"2.00",
"order"=>"001", "order1"=>"1", "sport"=>"darts" }

We therefore evaluate the block as follows (using merge rather than merge!/update):

gv.merge(hv) do |k,gvv,hvv|
  case
  when sum_fields.include?(k) then "%.2f" % (gvv.to_f + hvv.to_f)
  when max_fields.include?(k) then [gvv, hvv].max
  else gvv
  end
end
#=> { "id"=>2, "idx"=>112, "money"=>"6.00", "money1"=>"3.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog", "sport"=>"darts"}

gv does not contain the key "sport", so the block is not used when merging "sport"=>"darts" into gv. All other keys of hv are present in gv, however, so we use the block to determine their values in the merged hash. For:

k = "money"
gvv = "4.00"
hvv = "2.00"

we find:

sum_fields.include?(k)
#=> ["money", "money1"].include?("money")
#=> true

so the case statement returns:

"%.2f" % (gvv.to_f + hvv.to_f)
#=> "%.2f" % ("4.00".to_f + "2.00".to_f)
#=> "6.00"

The values for the other keys of hv, the hash being merged into gv, are computed similarly. The resulting merged hash becomes the new value of the key ["001", "1"] in g.

Lastly,

{ ["002", "2"] => { "id"=>3, "idx"=>113, "money"=>"3.00",
"money1"=>"1.00", "order"=>"002", "order1"=>"2" } }

is merged into g (which does not require the use of update's block) and g.values is returned by the method.

Observation

It would be easy to generalize this to pass pairs such as:

[["money","money1"], ->(a,b) { "%.2f" % (a.to_f + b.to_f) }]
[["id", "idx"], :max]

This could be done as follows:

def f_addition(arr, group_fields, *mods)
  arr.each_with_object({}) do |h,g|
    g.update( h.values_at(*group_fields) => h ) do |_,gv,hv|
      gv.merge(hv) do |k,gvv,hvv|
        f,op = mods.find { |f,op| f.include?(k) }
        if f
          case op
          when Proc then op.call(gvv,hvv)
          when Symbol then [gvv, hvv].send(op)
          end
        else
          gvv
        end
      end
    end
  end.values
end

f_addition(arr, ["order","order1"],
[["money","money1"], ->(a,b) { "%.2f" % (a.to_f + b.to_f) }],
[["id", "idx"], :max])
# => [{ "id"=>2, "idx"=>112, "money"=>"6.00", "money1"=>"3.00",
# "order"=>"001", "order1"=>"1", "pet"=>"dog", "sport"=>"darts" },
# { "id"=>3, "idx"=>113, "money"=>"3.00", "money1"=>"1.00",
# "order"=>"002", "order1"=>"2" }]

1. We will find that the calculations in update's block do not depend on the block variable k. I've therefore replaced that variable with the local variable _, to inform the reader of this.

Ruby on Rails - Hash of Arrays, group by and sum by with many columns

Using Enumerable#group_by, you can iterate over arrays of hashes grouped by the order and order1 keys.

Then merge hashes (by summing up money, money1 entries):

a = [
{"idx"=>"1234", "account"=>"abde", "money"=>"4.00", "money1"=>"1.00", "order"=>"00001", "order1"=>"1"},
{"idx"=>"1235", "account"=>"abde", "money"=>"2.00", "money1"=>"1.00", "order"=>"00001", "order1"=>"1"},
{"idx"=>"1235", "account"=>"abde", "money"=>"3.00", "money1"=>"1.00", "order"=>"00002", "order1"=>"2"}
]
a.group_by { |x| x.values_at('order', 'order1') }.map {|key, hashes|
  result = hashes[0].clone
  ['money', 'money1'].each { |key|
    result[key] = hashes.inject(0) { |s, x| s + x[key].to_f }
  }
  result
}
# => [{"idx"=>"1234", "account"=>"abde", "money"=>6.0, "money1"=>2.0, "order"=>"00001", "order1"=>"1"},
# {"idx"=>"1235", "account"=>"abde", "money"=>3.0, "money1"=>1.0, "order"=>"00002", "order1"=>"2"}]

The same approach, with the grouping and summing keys pulled out into variables:

group_keys = ['order', 'order1']
sum_keys = ['money', 'money1']
a.group_by { |x| x.values_at(*group_keys) }.map {|key, hashes|
  result = hashes[0].clone
  sum_keys.each { |key|
    result[key] = hashes.inject(0) { |s, x| s + x[key].to_f }
  }
  result
}
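
Note that the sums above come back as Floats (6.0) rather than as strings like the inputs. If the original string format matters, a variant along these lines might work. The "%.2f" format and the trimmed-down sample data are assumptions for illustration, not part of the answer above:

```ruby
# Sample data (assumed, shaped like the array above but with fewer keys).
a = [
  { "idx" => "1234", "money" => "4.00", "money1" => "1.00", "order" => "00001", "order1" => "1" },
  { "idx" => "1235", "money" => "2.00", "money1" => "1.00", "order" => "00001", "order1" => "1" },
  { "idx" => "1235", "money" => "3.00", "money1" => "1.00", "order" => "00002", "order1" => "2" }
]

group_keys = ["order", "order1"]
sum_keys   = ["money", "money1"]

totals = a.group_by { |x| x.values_at(*group_keys) }.map do |_group, hashes|
  # Build a small hash of formatted sums, then lay it over the first hash of the group.
  sums = sum_keys.to_h { |k| [k, format("%.2f", hashes.sum { |x| x[k].to_f })] }
  hashes.first.merge(sums)
end

# totals.first["money"] #=> "6.00"
```

Because merge returns a new hash, the hashes in a are left untouched.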

Ruby: Select a grouped array with multiple conditions

There are a lot of good answers here. I'd like to add that you can eliminate a lot of iteration by combining operations.

For example, rather than calculating the sums for each group in a second step, you can do that inside your group_by block:

sums = Hash.new(0)

groups = transactions.group_by do |t|
  sums[t["name"]] += t["amount"]
  t["name"]
end

p groups
# => { "CAR" => [ { "amount" => -3000, "name" => "CAR" } ],
# "BOAT" => [ ... ],
# "HOUSE" => [ ... ] }

p sums
# => { "CAR" => -3000, "BOAT" => -1800, "HOUSE" => -500 }

Next, instead of calling groups.select to eliminate groups with only one member and then min_by to get the final result, combine the former into the latter:

result = groups.min_by do |k,g|
  g.size > 1 ? sums[k] : Float::INFINITY
end

p result[1]
# => [ { "amount" => -600, "name" => "BOAT" },
# { "amount" => -600, "name" => "BOAT" },
# { "amount" => -600, "name" => "BOAT" } ]

Because everything is smaller than Float::INFINITY, those groups with only one member will never be selected (unless every group has only one member).
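
The sentinel idea can be tried on a small data set. The transactions below are assumed sample data (the question's full list is not reproduced here), and the final size check is an added guard for the "every group has only one member" case, not part of the answer:

```ruby
# Sample data (assumed).
transactions = [
  { "amount" => -3000, "name" => "CAR" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -250,  "name" => "HOUSE" },
  { "amount" => -250,  "name" => "HOUSE" }
]

sums = Hash.new(0)
groups = transactions.group_by do |t|
  sums[t["name"]] += t["amount"]
  t["name"]
end

# CAR has the smallest sum (-3000) but only one member, so it scores INFINITY.
name, group = groups.min_by { |k, g| g.size > 1 ? sums[k] : Float::INFINITY }

# If no group has more than one member, min_by still returns some group,
# so check the size before trusting the result.
result = group.size > 1 ? group : nil

# name #=> "BOAT"
```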

And so...

Solution 1

Putting it all together:

sums = Hash.new(0)

result =
  transactions.group_by {|t|
    sums[t["name"]] += t["amount"]
    t["name"]
  }.min_by {|k,g| g.size > 1 ? sums[k] : Float::INFINITY }[1]

p result
# => [ { "amount" => -600, "name" => "BOAT" },
# { "amount" => -600, "name" => "BOAT" },
# { "amount" => -600, "name" => "BOAT" } ]

Solution 2

You could also combine all of this into a single reduce and iterate over the data only once, but it's not very Rubyish:

sums = Hash.new(0)
groups = Hash.new {|h,k| h[k] = [] }
min_sum = Float::INFINITY

result = transactions.reduce(nil) do |min_group, t|
  name = t["name"]
  sum = sums[name] += t["amount"]
  (group = groups[name]) << t

  if group.size > 1 && sum < min_sum
    min_sum, min_group = sum, group
  end
  min_group
end

Note that you could move all of those outside variable declarations into, say, an array passed to reduce (instead of nil), but it would impact readability a lot.
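
For the curious, here is one way that array-as-accumulator version might look. It is only a sketch: the sample transactions are assumed, and like Solution 2 it compares running sums, so it is only reliable when a group's sum can only decrease (all amounts negative, as in the example):

```ruby
# Sample data (assumed).
transactions = [
  { "amount" => -3000, "name" => "CAR" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -250,  "name" => "HOUSE" },
  { "amount" => -250,  "name" => "HOUSE" }
]

# All state travels in the accumulator: [sums, groups, min_sum, min_group].
initial = [Hash.new(0), Hash.new { |h, k| h[k] = [] }, Float::INFINITY, nil]

*_, min_group = transactions.reduce(initial) do |(sums, groups, min_sum, best), t|
  name  = t["name"]
  sum   = sums[name] += t["amount"]
  group = groups[name] << t

  if group.size > 1 && sum < min_sum
    [sums, groups, sum, group]      # this group is the new minimum
  else
    [sums, groups, min_sum, best]   # keep the previous minimum
  end
end

# min_group now holds the three "BOAT" hashes
```

The destructured block parameter keeps it workable, but the separate variables of Solution 2 are arguably easier to follow.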

Rails group and sum array of objects

First of all, your nested map followed by flatten(1) can be simplified by making the first map into flat_map. If you do this you could remove the flatten(1).

From this point your code is most of the way there, but you could make the following changes to get the desired output:

  1. You can group by multiple attributes, name and id. In another language you might use a tuple for this. Ruby doesn't have tuples, so we can just use a two-element array:

    .group_by { |p| [p[:item_id], p[:item_name]] }
    .transform_values { |vals| vals.sum { |val| val[:quantity] } }
  2. At this point you have a hash mapping [id,name] tuple to quantity:

    { [1,"foo"] => 123, [2, "bar"] => 456 }

    and you can coerce this to the desired data type using reduce (or each_with_object, if you prefer):

    .reduce([]) do |memo, ((id, name), quantity)|
      memo << {
        id: id,
        name: name,
        quantity: quantity
      }
    end

The weird-looking ((id, name), quantity) is a kind of destructuring. See https://jsarbada.wordpress.com/2019/02/05/destructuring-with-ruby/ specifically the sections on "Destructuring Block Arguments" and "Destructuring Hashes".
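
Put together, the two steps might look like the sketch below. The line_items array is hypothetical input standing in for whatever the flat_map step produces; only the :item_id, :item_name and :quantity keys come from the snippets above:

```ruby
# Hypothetical input (assumed shape).
line_items = [
  { item_id: 1, item_name: "foo", quantity: 100 },
  { item_id: 2, item_name: "bar", quantity: 456 },
  { item_id: 1, item_name: "foo", quantity: 23 }
]

totals = line_items
  .group_by { |p| [p[:item_id], p[:item_name]] }                  # key is an [id, name] pair
  .transform_values { |vals| vals.sum { |val| val[:quantity] } }  # pair => summed quantity
  .reduce([]) do |memo, ((id, name), quantity)|
    memo << { id: id, name: name, quantity: quantity }
  end

# totals #=> [{ id: 1, name: "foo", quantity: 123 }, { id: 2, name: "bar", quantity: 456 }]
```

Since every key/value pair maps to exactly one output hash, map { |(id, name), quantity| ... } would give the same result without the memo.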

Group an array of hashes by two conditions with Ruby

To group them by conversation regardless of who sent the message, sort the pair of ids inside the group_by block, so that [a, b] and [b, a] produce the same key:

.group_by{ |r| r.values_at('recipient_id', 'sender_id').sort }
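
A quick sketch of the effect, using made-up messages (only the 'recipient_id' and 'sender_id' keys come from the line above; "body" is an assumed extra field):

```ruby
# Hypothetical messages, for illustration only.
messages = [
  { "sender_id" => 1, "recipient_id" => 2, "body" => "hi" },
  { "sender_id" => 2, "recipient_id" => 1, "body" => "hello" },
  { "sender_id" => 1, "recipient_id" => 3, "body" => "hey" }
]

# Sorting the id pair makes 1 -> 2 and 2 -> 1 land under the same key.
conversations = messages.group_by { |r| r.values_at("recipient_id", "sender_id").sort }

# conversations.keys #=> [[1, 2], [1, 3]]
```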

Ruby on Rails - Get array of values from array of hash with particular order of existing keys

Map all the records and then map the attributes in the order given to return the attributes' values in the specified order.

records = [
{:age=>28, :name=>"John", :id=>1},
{:name=>"David", :age=>20, :id=>2, :sex=>"male"}
]

attributes = [:id, :name, :age]
records.map do |record|
  attributes.map { |attr| record[attr] }
end
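
The inner map can also be written with Hash#values_at, which returns the values in the order the keys are given. This is an alternative spelling of the same idea, shown on the same records:

```ruby
records = [
  { age: 28, name: "John", id: 1 },
  { name: "David", age: 20, id: 2, sex: "male" }
]
attributes = [:id, :name, :age]

# values_at returns nil for any key a record does not have.
rows = records.map { |record| record.values_at(*attributes) }

# rows #=> [[1, "John", 28], [2, "David", 20]]
```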

Ruby select top 3 groups by amount

You can build a hash of the types (names), and sum the values as you go along with:

@transactions.each_with_object(Hash.new(0)) do |obj, hash|
  hash[obj["name"]] += obj["amount"].abs
end

Then you can add some moar magic to the end of that, or break it up into more lines (recommended for readability):

@transactions.each_with_object(Hash.new(0)) do |obj, hash|
  hash[obj["name"]] += obj["amount"].abs
end.sort_by(&:last).map(&:first).last(3).reverse

Basically, that's sorting by values (which turns your new hash to an array of tuples), then mapping the first value of each tuple (the name), then taking the top 3.

Edit

I didn't notice the negatives at first, so I now sum the absolute values of the amounts. sort_by sorts from smallest to largest, so take the last three and reverse them to get largest-to-smallest order.

That's a lot to pack into one small chain, so I'd suggest breaking it up.
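
One possible way to break it up is sketched below. The data is assumed sample data, a local variable stands in for @transactions, and max_by(3) is swapped in for the sort_by / last(3) / reverse chain (it returns the three largest entries, largest first):

```ruby
# Sample data (assumed).
transactions = [
  { "name" => "CAR",   "amount" => -3000 },
  { "name" => "BOAT",  "amount" => -600 },
  { "name" => "BOAT",  "amount" => -600 },
  { "name" => "HOUSE", "amount" => -500 },
  { "name" => "FOOD",  "amount" => -100 }
]

# Step 1: total absolute amount per name.
totals = transactions.each_with_object(Hash.new(0)) do |obj, hash|
  hash[obj["name"]] += obj["amount"].abs
end

# Step 2: the three names with the largest totals, largest first.
top_three = totals.max_by(3) { |_name, total| total }.map(&:first)

# top_three #=> ["CAR", "BOAT", "HOUSE"]
```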

Split array of subarrays by subarrays sum size Ruby

You can use the reduce method and keep pushing sub-arrays to a new array. Consider the following:

new_arr = arr.reduce([]) do |acc, sub_array|
  last_element = acc[acc.length - 1]

  if last_element.nil? or (last_element + sub_array).length > 6
    acc << sub_array
  else
    acc[acc.length - 1] = last_element + sub_array
  end
  acc
end

# Tests
new_arr.flatten.size == arr.flatten.size # test total number of elements in both the arrays
new_arr.map(&:size) # the sizes of all sub arrays
new_arr.map(&:size).min # min size of all sub arrays
new_arr.map(&:size).max # max size of all sub arrays

Let me know if the code is not clear to you

Update:

The reduce method will "reduce" any enumerable object to a single value by iterating through every element of the enumerable, just like each or map.

Consider an example:

# Find the sum of array
arr = [1, 2, 3]

# Reduce will accept an initial value & a block with two arguments
# initial_value: is used to set the value of the accumulator in the first loop

# Block Arguments:
# accumulator: accumulates data through the loop and finally returned by :reduce
# value: each item of the above array in every loop(just like :each)

arr.reduce(0) do |acc, value|
  # initial value is 0; in the first loop acc's value will be set to 0
  # henceforth acc's value will be what is returned from the block in every loop

  acc += value
  acc # acc is being returned; in the second loop the value of acc will be (0 + 1)
end

So in this case in every loop, we add the value of the item to the accumulator and return the accumulator for use in the next loop. And once reduce has iterated all the items in the array it will return the accumulator.

Ruby also provides syntactic sugar to make it look much fancier:

arr.reduce(:+) # return 6
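
One detail worth knowing about the symbol form is what happens on an empty array when no initial value is given. A small sketch, also showing Array#sum as an alternative:

```ruby
arr = [1, 2, 3]

arr.reduce(:+)     #=> 6
arr.reduce(0, :+)  #=> 6 (explicit initial value)
arr.sum            #=> 6

# With no initial value, reducing an empty array returns nil rather than 0.
[].reduce(:+)      #=> nil
[].reduce(0, :+)   #=> 0
[].sum             #=> 0
```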

Here's a good article for further reference

So if you take your question for example:

# Initial value is set to an empty array, what we're passing to reduce
new_arr = arr.reduce([]) do |acc, sub_array|
  # In the first loop acc's value will be set to []

  # we're finding the last element of acc (in first loop since the array is empty
  # last element will be nil)
  last_element = acc[acc.length - 1]

  # If last_element is nil (in first loop) we push the first item of the array to acc
  # If last_element is found (pushed in the previous loops), we take it and sum
  # it with the item from the current loop and see the size, if size is more
  # than 6, we only push the item from current loop
  if last_element.nil? or (last_element + sub_array).length > 6
    acc << sub_array
  else
    # If last element is present & the size of last_element + item from current
    # loop is 6 or less, we replace the last element of the accumulator with
    # (last_element + item from current loop).
    acc[acc.length - 1] = last_element + sub_array
  end

  # Finally we return the accumulator, which will be used in the next loop
  # Or if it has looped through the entire array, it will be returned to
  # where reduce was called
  acc
end
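
To see the behaviour on concrete data, here is a sketch of the same logic written with each_with_object, so the accumulator does not have to be returned from the block. The input array is assumed sample data and the limit of 6 comes from the code above:

```ruby
# Sample input (assumed).
arr = [[1, 2], [3, 4, 5], [6, 7], [8], [9, 10, 11, 12], [13]]

new_arr = arr.each_with_object([]) do |sub_array, acc|
  if acc.empty? || acc.last.length + sub_array.length > 6
    acc << sub_array.dup        # start a new chunk (dup so arr is not modified later)
  else
    acc.last.concat(sub_array)  # extend the current chunk in place
  end
end

# new_arr #=> [[1, 2, 3, 4, 5], [6, 7, 8], [9, 10, 11, 12, 13]]
```

A sub-array is only merged into the previous chunk when the combined size stays at 6 or below, so chunk sizes here are 5, 3 and 5.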

Sum on multiple columns with Activerecord

You can use raw SQL if you need to. Something like this to return an object where you'll have to extract the values... I know you specify active record!

Student.select("SUM(students.total_mark) AS total_mark, SUM(students.marks_obtained) AS marks_obtained").where(:id=>student_id)

For rails 4.2 (earlier unchecked)

Student.select("SUM(students.total_mark) AS total_mark, SUM(students.marks_obtained) AS marks_obtained").where(:id=>student_id)[0]

NB the brackets following the statement. Without them the statement returns a Class::ActiveRecord_Relation, not the AR instance. What's significant about this is that you CANNOT use first on the relation.

....where(:id=>student_id).first #=> PG::GroupingError: ERROR:  column "students.id" must appear in the GROUP BY clause or be used in an aggregate function

