Postgresql Aggregate Array

Postgresql aggregate array

Use array_agg: http://www.sqlfiddle.com/#!1/5099e/1

SELECT s.name,  array_agg(g.Mark) as marks        
FROM student s
LEFT JOIN Grade g ON g.Student_id = s.Id
GROUP BY s.Id

By the way, if you are using Postgres 9.1, you don't need to repeat the columns on SELECT to GROUP BY, e.g. you don't need to repeat the student name on GROUP BY. You can merely GROUP BY on primary key. If you remove the primary key on student, you need to repeat the student name on GROUP BY.

CREATE TABLE grade
(Student_id int, Mark varchar(2));

INSERT INTO grade
(Student_id, Mark)
VALUES
(1, 'A'),
(2, 'B'),
(2, 'B+'),
(3, 'C'),
(3, 'A');

CREATE TABLE student
(Id int primary key, Name varchar(5));

INSERT INTO student
(Id, Name)
VALUES
(1, 'John'),
(2, 'David'),
(3, 'Will');

postgresql Aggregate arrays into a single array with union of all elements

Something like this should work:

with mytable as
(
select 1 as id, array[1, 2, 3, 4] as myarray

union all

select 1 as id, array[5, 6] as myarray

union all

select 1 as id, array[7, 8] as myarray
)

select
id,
array_agg(elements order by elements)
from mytable, unnest(myarray) as elements
group by id

There's some discussion about building the custom function here: Concatenate/merge array values during grouping/aggregation

PostgreSQL: Efficiently aggregate array columns as part of a group by

demo:db<>fiddle

Assuming each record contains a unique id value:

SELECT
fruit,
array_agg(DISTINCT flavor), -- 2
COUNT(DISTINCT id) -- 3
FROM
example,
unnest(flavors) AS flavor -- 1
GROUP BY fruit
  1. unnest() array elements
  2. Group by fruit value: array_agg() for distinct flavors
  3. Group by fruit value: COUNT() for distinct ids with each fruit group.

if the id column is really empty, you could generate the id values for example with the row_number() window function:

demo:db<>fiddle

SELECT
*
FROM (
SELECT
*, row_number() OVER () as id
FROM example
) s,
unnest(flavors) AS flavor

Postgres: aggregate column into array

Here is a better solution as suggested by Abelisto in the comments:

select
list_id,
name,
array_agg(game_id order by position)
from list
join list_item
using (list_id)
group by list_id, name

Nested aggregate function in PostgreSQL

In your question so many things are unclear. Based on what i understood from your current query, try this:

with cte as (
SELECT ag.article_group_id,
ag.article_group,
au.author_id,
au.author_name,
avg(gr.total) as avg_total
FROM article_group ag
LEFT JOIN article ar on ar.article_group_id=ag.article_group_id
LEFT JOIN article_to_author ata ON ar.article_id = ata.article_id
LEFT JOIN author au ON ata.author_id = au.author_id
LEFT JOIN grade gr ON ar.article_id = gr.article_id
GROUP BY ag.article_group_id, ag.article_group, au.author_id, au.author_name
)
SELECT article_group_id,
article_group,
array_agg('[' || author_id || ',' || author_name || ',' || avg_total || ']')
FROM cte
GROUP BY article_group_id, article_group

You can change whatever you want in array_agg

DEMO

How can I apply aggregate functions element-wise over arrays in PostgreSQL, e.g. weighted array sums over a group?

I would write functions for that, otherwise the SQL will get really messy.

One function to multiply all elements with a given value:

create function array_mul(p_input real[], p_mul int)
returns real[]
as
$$
select array(select i * p_mul
from unnest(p_input) with ordinality as t(i,idx)
order by idx);
$$
language sql
immutable;

And one function to be used as an aggregate that sums up the elements with the same index:

create or replace function array_add(p_one real[], p_two real[])
returns real[]
as
$$
declare
l_idx int;
l_result real[];
begin
if p_one is null or p_two is null then
return coalesce(p_one, p_two);
end if;

for l_idx in 1..greatest(cardinality(p_one), cardinality(p_two)) loop
l_result[l_idx] := coalesce(p_one[l_idx],0) + coalesce(p_two[l_idx], 0);
end loop;

return l_result;
end;
$$
language plpgsql
immutable;

That can be used to define a custom aggregate:

create aggregate array_element_sum(real[]) (
sfunc = array_add,
stype = real[],
initcond = '{}'
);

And then your query is as simple as:

select grp, array_element_sum(array_mul(vals, n))
from tbl
group by grp;

Online example

PostgreSQL aggregate over json arrays

Use the function for orders->'items' to flatten the data:

select elem->>'name' as name, (elem->>'price')::numeric as price
from my_table
cross join jsonb_array_elements(orders->'items') as elem;

It is easy to get the aggregates you want from the flattened data:

select name, count(*), sum(price) as total_value_sold
from (
select elem->>'name' as name, (elem->>'price')::numeric as price
from my_table
cross join jsonb_array_elements(orders->'items') as elem
) s
group by name;

Db<>fiddle.

Aggregate single array of distinct elements from array column, excluding NULL

Plain array_agg() does this with arrays:

Concatenates all the input arrays into an array of one higher
dimension. (The inputs must all have the same dimensionality, and
cannot be empty or null.)

Not what you need. See:

  • Is there something like a zip() function in PostgreSQL that combines two arrays?

You need something like this: unnest(), process and sort elements an feed the resulting set to an ARRAY constructor:

SELECT ARRAY(
SELECT DISTINCT elem::date
FROM (SELECT unnest(date_array) FROM example) AS e(elem)
WHERE elem IS NOT NULL
ORDER BY elem DESC
);

db<>fiddle here

To be clear: we could use array_agg() (taking non-array input, different from your incorrect use) instead of the final ARRAY constructor. But the latter is faster (and simpler, too, IMO).

They happen to be timestamps of just dates (eg without time or timezone)

So cast to date and trim the noise.

Should be the fastest way:

  • A correlated subquery is a bit faster than a LATERAL one (and does the simple job).
  • An ARRAY constructor is a bit faster than the aggregate function array_agg() (and does the simple job).
  • Most importantly, sorting and applying DISTINCT in a subquery is typically faster than inline ORDER BY and DISTINCT in an aggregate function (and does the simple job).

See:

  • Unnest arrays of different dimensions

  • How to select 1d array from 2d array?

  • Why is array_agg() slower than the non-aggregate ARRAY() constructor?

  • What is the difference between LATERAL JOIN and a subquery in PostgreSQL?

Performance comparison:

db<>fiddle here



Related Topics



Leave a reply



Submit