Concatenate Multiple Rows in an Array with SQL on Postgresql

Concatenate multiple rows in an array with SQL on PostgreSQL

This is a Postgres built-in since a few versions so you no longer need to define your own, the name is array_agg().

test=> select array_agg(n) from generate_series(1,10) n group by n%2;
array_agg
--------------
{1,3,5,7,9}
{2,4,6,8,10}

(this is Postgres 8.4.8).

Note that no ORDER BY is specified, so the order of the result rows depends on the grouping method used (here, hash) ie, it is not defined. Example:

test=> select n%2, array_agg(n) from generate_series(1,10) n group by (n%2);
?column? | array_agg
----------+--------------
1 | {1,3,5,7,9}
0 | {2,4,6,8,10}

test=> select (n%2)::TEXT, array_agg(n) from generate_series(1,10) n group by (n%2)::TEXT;
text | array_agg
------+--------------
0 | {2,4,6,8,10}
1 | {1,3,5,7,9}

Now, I don't know why you get {10,2,4,6,8} and {9,7,3,1,5}, since generate_series() should send the rows in order.

PostgreSQL 9.1: How to concatenate rows in array without duplicates, JOIN another table

Instead of using window functions and partitioning, use a query-level GROUP BY and aggregate with a DISTINCT clause:

SELECT         
rnp.grp_id,
array_to_string(array_agg(distinct rnp.cabinets),',') AS cabinets,
array_to_string(array_agg(distinct ips.address),',') AS addresses
FROM rnp JOIN ips ON rnp.grp_id=ips.grp_id GROUP BY rnp.grp_id, ips.grp_id;

Result:

 grp_id |        cabinets         | addresses 
--------+-------------------------+-----------
11 | cabs1,cabs2,cabs3,cabs4 | CA,NY
22 | c1,c2 | DC,LA
(2 rows)

The key here is that instead of using window functions and patitioning, you use a query-level GROUP BY and aggregate with a DISTINCT clause.

This'd work with the window function approach too, except that PostgreSQL (9.1 at least) doesn't support DISTINCT in window functions:

regress=# SELECT DISTINCT
rnp.grp_id,
array_to_string(array_agg(distinct rnp.cabinets)OVER (PARTITION BY rnp.grp_id), ',') AS cabinets,
array_to_string(array_agg(distinct ips.address) OVER (PARTITION BY ips.grp_id), ',') AS addresses
FROM rnp JOIN ips ON rnp.grp_id=ips.grp_id;
ERROR: DISTINCT is not implemented for window functions
LINE 3: array_to_string(array_agg(distinct rnp.cabinets)OVER (PART...

Concatenate multiple result rows of one column into one, group by another column

Simpler with the aggregate function string_agg() (Postgres 9.0 or later):

SELECT movie, string_agg(actor, ', ') AS actor_list
FROM tbl
GROUP BY 1;

The 1 in GROUP BY 1 is a positional reference and a shortcut for GROUP BY movie in this case.

string_agg() expects data type text as input. Other types need to be cast explicitly (actor::text) - unless an implicit cast to text is defined - which is the case for all other string types (varchar, character, name, ...) and some other types.

As isapir commented, you can add an ORDER BY clause in the aggregate call to get a sorted list - should you need that. Like:

SELECT movie, string_agg(actor, ', ' ORDER BY actor) AS actor_list
FROM tbl
GROUP BY 1;

But it's typically faster to sort rows in a subquery. See:

  • Create array in SELECT

Concatenate rows in function PostgreSQL

You do not need PL/pgSQL for that.

First eliminate duplicate names using DISTINCT and then in a subquery you can concat the columns into a single string. After that use array_agg to create an array out of it. It will then "merge" multiple arrays, in case the subquery returns more than one row. Finally, get rid of the commas and curly braces using array_to_string. Instead of using the char value of a newline, you can simply use E'\n' (E stands for escape):

WITH j (name,location,team_id,start,end_) AS (
VALUES ('Library','Atlanta',2389,2015,2017),
('Library','Georgetown',9920,2003,2007),
('Museum','Auckland',3092,2005,2007)
)
SELECT
DISTINCT q1.name,
array_to_string(
(SELECT array_agg(concat(location,', ',team_id,', ',start,'-', end_, E'\n'))
FROM j WHERE name = q1.name),'') AS records
FROM j q1;

name | records
---------+----------------------------
Library | Atlanta, 2389, 2015-2017
| Georgetown, 9920, 2003-2007
|
Museum | Auckland, 3092, 2005-2007
  • Note: try to not use reserved strings (e.g. end,name,start, etc.) to name your columns. Although PostgreSQL allows you to use them, it is considered a bad practice.

Demo: db<>fiddle

Concatenate few rows into a single row with comma seperated using string_agg in postgreSQL

Give this a shot:

select visit_id, string_agg(page_name, ',' order by sequence_number) as cart_items                        
from events e
join page_hierarchy ph on e.page_id = ph.page_id
join event_identifier ei on ei.event_type = e.event_type
where event_name = 'Add to Cart'
group by visit_id

The order by sequence_number within the string_agg(...) function will sort your comma separated output based on sequence number.

Here's an example with the sample data you provided:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=52077d5be605a51d3a7bb14152a392df

Here're the results of that:


visit_id | cart_items
:------- | :------------
d58cbd | Kingfish,Tuna

PostgreSQL: how to combine multiple rows?

I suggest the following approach:

SELECT client_id, array_agg(result) AS results
FROM labresults
GROUP BY client_id;

It's not exactly the same output format, but it will give you the same information much faster and cleaner.

If you want the results in separate columns, you can always do this:

SELECT client_id,
results[1] AS result1,
results[2] AS result2,
results[3] AS result3
FROM
(
SELECT client_id, array_agg(result) AS results
FROM labresults
GROUP BY client_id
) AS r
ORDER BY client_id;

although that will obviously introduce a hardcoded number of possible results.

Postgres merge two rows with common array elements

Can procedure suffice?

CREATE OR REPLACE PROCEDURE add_ids(new_ids INT[])
AS $$
DECLARE sum_array INT[];
BEGIN
SELECT ARRAY (SELECT UNNEST(ids) FROM table1 WHERE table1.ids && new_ids) INTO sum_array;
sum_array := sum_array || new_ids;
SELECT ARRAY(SELECT DISTINCT UNNEST(sum_array)) INTO sum_array;
DELETE FROM table1 WHERE table1.ids && sum_array;
INSERT INTO table1(ids) SELECT sum_array;
END;
$$
LANGUAGE plpgsql;

Unfortunately inserting row inside trigger calls another trigger causing infinitie loop. I do not know work around that.

PS. Sorry if creating another answer is bad practice. I want to leave it for now for reference. I will delete it when the problem is resolved.

Edit by pewpewlasers:

To prevent the loop another table is probably needed. I have created a new temporary table2. New arrays can be added to this table. This table will have a trigger which does the calculations and saves it to table1. It also deletes this temporarily created row.

CREATE OR REPLACE FUNCTION  on_insert_temp() RETURNS TRIGGER AS $f$
DECLARE sum_array BIGINT[];
BEGIN
SELECT ARRAY (SELECT UNNEST(ids) FROM table1 WHERE table1.ids && NEW.ids) INTO sum_array;
sum_array := sum_array || NEW.ids;
SELECT ARRAY(SELECT DISTINCT UNNEST(sum_array)) INTO sum_array;
DELETE FROM table1 WHERE table1.ids && sum_array;
INSERT INTO table1(ids) SELECT sum_array;
DELETE FROM table2 WHERE id = NEW.id;
RETURN OLD;
END
$f$ LANGUAGE plpgsql;

CREATE TRIGGER on_insert_temp AFTER INSERT ON table2 FOR EACH ROW EXECUTE PROCEDURE on_insert_temp();


Related Topics



Leave a reply



Submit