Pairwise Array Sum Aggregate Function

Pairwise array sum aggregate function?

General solutions for any number of arrays with any number of elements. Individual elements or the whole array can be NULL, too:

Simpler in 9.4+ using WITH ORDINALITY

SELECT ARRAY (
   SELECT sum(elem)
   FROM tbl t
      , unnest(t.arr) WITH ORDINALITY x(elem, rn)
   GROUP BY rn
   ORDER BY rn
);
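
For reference, a minimal setup these queries assume (the int element type is an assumption; any type sum() accepts works the same way):

CREATE TABLE tbl (arr int[]);

INSERT INTO tbl (arr) VALUES
  ('{1,2,3}')
, ('{10,20,30}')
, (NULL)               -- whole array is NULL: contributes nothing
, ('{100,NULL,300}');  -- single element is NULL: ignored by sum()

-- The query above returns {111,22,333}.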

See:

  • PostgreSQL unnest() with element number

Postgres 9.3+

This makes use of an implicit LATERAL JOIN:

SELECT ARRAY (
   SELECT sum(arr[rn])
   FROM tbl t
      , generate_subscripts(t.arr, 1) AS rn
   GROUP BY rn
   ORDER BY rn
);
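
For illustration, generate_subscripts() simply enumerates the valid indexes of an array along the given dimension:

SELECT rn FROM generate_subscripts(ARRAY[10, 20, 30], 1) AS rn;  -- 1, 2, 3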

See:

  • What is the difference between LATERAL JOIN and a subquery in PostgreSQL?

Postgres 9.1

SELECT ARRAY (
   SELECT sum(arr[rn])
   FROM (
      SELECT arr, generate_subscripts(arr, 1) AS rn
      FROM tbl t
   ) sub
   GROUP BY rn
   ORDER BY rn
);

The same works in later versions, but set-returning functions in the SELECT list are not standard SQL and were frowned upon by some. Their behavior is well-defined since Postgres 10, though. See:

  • What is the expected behaviour for multiple set-returning functions in SELECT clause?
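
A quick illustration of what changed: before Postgres 10, multiple set-returning functions in the SELECT list produced a number of rows equal to the least common multiple of their row counts; since Postgres 10 they run in lockstep and shorter results are padded with NULLs:

SELECT generate_series(1, 2) AS a, generate_series(1, 3) AS b;
-- Postgres 10+: 3 rows -> (1,1), (2,2), (NULL,3)
-- Postgres 9.x: 6 rows (least common multiple of 2 and 3)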

db<>fiddle here

Old sqlfiddle

Related:

  • Is there something like a zip() function in PostgreSQL that combines two arrays?

How can I sum elements of PostgreSQL arrays by position?

Unnest the arrays WITH ORDINALITY, sum the elements grouped by sku and ordinality, then aggregate the sums back into arrays (ordered by ordinality) grouped by sku:

select sku, array_agg(elem order by ordinality) as outputs
from (
   select sku, ordinality, sum(elem) as elem
   from jobs
   cross join unnest(outputs) with ordinality as u(elem, ordinality)
   group by 1, 2
) s
group by 1
order by 1;

DbFiddle.
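
For context, a hypothetical jobs table that the queries in this answer assume (column types are guesses based on the usage):

create table jobs (sku text, outputs int[]);

insert into jobs (sku, outputs) values
   ('A', array[1, 2, 3]),
   ('A', array[4, 5, 6]),
   ('B', array[7, 8, 9]);

-- Expected result: A -> {5,7,9}, B -> {7,8,9}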

If you often need this functionality in various contexts, it may be reasonable to create a custom aggregate:

create or replace function sum_int_arrays(int[], int[])
returns int[] language sql immutable as $$
   -- coalesce(a, 0) covers the aggregate's initial NULL state (no initcond is set),
   -- so the first array passes through unchanged
   select array_agg(coalesce(a, 0) + b)
   from unnest($1, $2) as u(a, b)
$$;

create aggregate sum_int_array_agg(integer[]) (
   sfunc = sum_int_arrays,
   stype = int[]
);

select sku, sum_int_array_agg(outputs)
from jobs
group by 1
order by 1

DbFiddle.
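
A quick sanity check of the transition function on its own (illustrative values):

select sum_int_arrays(array[1, 2, 3], array[10, 20, 30]);  -- {11,22,33}
select sum_int_arrays(null, array[1, 2, 3]);               -- {1,2,3}: a NULL state is treated as zeros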

Aggregate query using an array aggregate function, but with one of the values in a separate column

You can use conditional aggregation with the FILTER clause for this:

demo:db<>fiddle

SELECT
   component_code,
   MAX(name) FILTER (WHERE language_code = 'en') as en_name,
   ARRAY_AGG(name) FILTER (WHERE language_code != 'en') as all_name
FROM component_translation
GROUP BY component_code;
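
For illustration, a hypothetical slice of component_translation and the result the query produces for it:

CREATE TABLE component_translation (
   component_code text,
   language_code  text,
   name           text
);

INSERT INTO component_translation VALUES
   ('C1', 'en', 'Widget'),
   ('C1', 'de', 'Bauteil'),
   ('C1', 'fr', 'Composant');

-- component_code | en_name |      all_name
-- C1             | Widget  | {Bauteil,Composant}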

Postgres: on conflict, summing two vectors (arrays)

There are two problems with the expression:

array_agg(unnest(test.counters) + unnest([2,0,2,1]))

  • there is no + operator for arrays;
  • you cannot use a set-valued expression as an argument to an aggregate function.

You need to unnest both arrays in a single unnest() call placed in the FROM clause:

insert into test (name, counters)
values ('Joe', array[2, 0, 2, 1])
on conflict (name) do update
set counters = (
   select array_agg(e1 + e2)
   from unnest(test.counters, excluded.counters) as u(e1, e2)
);

Also pay attention to the correct array syntax in VALUES and to the use of the special record excluded (see the relevant information in the documentation).

Test it in db<>fiddle.
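
For reference, a minimal table this upsert assumes (the column types are an assumption; ON CONFLICT (name) requires a unique constraint or index on name):

create table test (
   name     text primary key,  -- unique, so on conflict (name) has something to match
   counters int[]
);

insert into test (name, counters) values ('Joe', array[1, 1, 0, 2]);
-- After the upsert above, Joe's counters become {3,1,2,3}.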

Summing arrays in conjunction with GROUP BY

The problem with your original query is that you're summing all elements, because GROUP BY id, nts is executed in the outer query. Combining a CTE with a LATERAL JOIN does the trick:

WITH tmp AS (
   SELECT
      id,
      date_trunc('hour', ts + '1 hour') AS nts,
      rn,
      sum(elem) AS counts
   FROM ts2
   LEFT JOIN LATERAL unnest(counts) WITH ORDINALITY x(elem, rn) ON TRUE
   GROUP BY id, nts, rn
)
SELECT id, nts, array_agg(counts ORDER BY rn)  -- ORDER BY rn keeps the sums in element order
FROM tmp
GROUP BY id, nts;
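
A hypothetical ts2 table matching the column names used above (types are guesses); date_trunc('hour', ts + '1 hour') labels each row with the end of the hour it falls in:

create table ts2 (
   id     int,
   ts     timestamp,
   counts int[]
);

insert into ts2 values
   (1, '2024-01-01 10:05', array[1, 2]),
   (1, '2024-01-01 10:45', array[10, 20]);

-- Both rows land in the 11:00 bucket: 1 | 2024-01-01 11:00:00 | {11,22}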

How to use an aggregate function on a column containing string values?

SELECT SUM(dividend) || '/' || SUM(divisor) AS FractionOfSums
     , AVG(dividend / divisor)              AS AverageOfFractions
FROM (
   SELECT CAST(substr(result, 1, position('/' in result) - 1) AS int) AS dividend
        , CAST(substr(result, position('/' in result) + 1) AS int)    AS divisor
   FROM rows
) AS division;
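
A minimal sketch of the assumed input, where result stores fractions as text. Note that dividend / divisor is integer division here, so AverageOfFractions truncates unless one operand is cast to numeric:

CREATE TABLE rows (result text);
INSERT INTO rows (result) VALUES ('1/2'), ('3/4');

-- FractionOfSums:     4/6
-- AverageOfFractions: 0 with integer division; casting one operand to numeric gives 0.625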

Get pairwise sums of multiple columns in a dataframe

You can use rowSums on a column subset.

As a data frame:

data.frame(ab = rowSums(x[c("a", "b")]), cd = rowSums(x[c("c", "d")]))
# ab cd
# 1 11 17
# 2 10 16
# 3 9 15
# 4 8 14
# 5 7 13

As a matrix:

cbind(ab = rowSums(x[1:2]), cd = rowSums(x[3:4]))

For a wider data frame, you can also use sapply over a list of column subsets.

sapply(list(1:2, 3:4), function(y) rowSums(x[y]))

For all pairwise column combinations:

y <- combn(ncol(x), 2L, function(y) rowSums(x[y]))
colnames(y) <- combn(names(x), 2L, paste, collapse = "")
y
# ab ac ad bc bd cd
# [1,] 11 13 16 12 15 17
# [2,] 10 13 15 11 13 16
# [3,] 9 13 14 10 11 15
# [4,] 8 13 13 9 9 14
# [5,] 7 13 12 8 7 13

Make PostgreSQL aggregate function sum() return NULL if at least one of the addends is NULL

You can create your own aggregate with the expected behavior:

CREATE OR REPLACE FUNCTION sum_null(x anyelement, y anyelement)
RETURNS anyelement LANGUAGE sql AS $$
   SELECT x + y;
$$;

CREATE OR REPLACE AGGREGATE sum_null(x anyelement) (
   SFUNC    = sum_null,
   STYPE    = anyelement,
   INITCOND = 0
);

See the test result in dbfiddle
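
For illustration, the intended semantics once the aggregate above exists: because the transition function is not strict, NULL inputs are not skipped; x + NULL is NULL, and the state stays NULL from then on:

SELECT sum_null(x) FROM (VALUES (1), (2), (3)) AS t(x);     -- 6
SELECT sum_null(x) FROM (VALUES (1), (NULL), (3)) AS t(x);  -- NULL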

Element-wise addition of 2 lists?

Use map with operator.add:

>>> from operator import add
>>> list( map(add, list1, list2) )
[5, 7, 9]

or zip with a list comprehension:

>>> [sum(x) for x in zip(list1, list2)]
[5, 7, 9]

Timing comparisons:

>>> list2 = [4, 5, 6]*10**5
>>> list1 = [1, 2, 3]*10**5
>>> %timeit from operator import add;map(add, list1, list2)
10 loops, best of 3: 44.6 ms per loop
>>> %timeit from itertools import izip; [a + b for a, b in izip(list1, list2)]
10 loops, best of 3: 71 ms per loop
>>> %timeit [a + b for a, b in zip(list1, list2)]
10 loops, best of 3: 112 ms per loop
>>> %timeit from itertools import izip;[sum(x) for x in izip(list1, list2)]
1 loops, best of 3: 139 ms per loop
>>> %timeit [sum(x) for x in zip(list1, list2)]
1 loops, best of 3: 177 ms per loop

