How to use array_agg() for varchar[]

The standard aggregate function array_agg() only works for base types, not array types as input.
(But Postgres 9.5+ has a new variant of array_agg() that can!)

You could use the custom aggregate function array_agg_mult() as defined in this related answer:

Selecting data into a Postgres array

Create it once per database. Then your query could work like this:

SELECT use.user_sched_id, array_agg(se.sched_entry_id) AS seids
,array_agg_mult(ARRAY[se.min_crew]) AS min_crew_arr
FROM base.sched_entry se
LEFT JOIN base.user_sched_entry use USING (sched_entry_id)
WHERE se.sched_entry_id = ANY(ARRAY[623, 625])
GROUP BY user_sched_id;
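
On Postgres 9.5 or later, the built-in variant mentioned above may make the custom aggregate unnecessary. A minimal sketch, assuming made-up sample rows in a CTE that stands in for base.sched_entry (the values are hypothetical, not from the original question):

```sql
-- Requires Postgres 9.5+; sample data is hypothetical.
WITH sched_entry(sched_entry_id, min_crew) AS (
   VALUES
     (623, '{a,b}'::varchar[])
   , (625, '{c,d}')
)
SELECT array_agg(sched_entry_id ORDER BY sched_entry_id) AS seids
     , array_agg(min_crew ORDER BY sched_entry_id)       AS min_crew_arr  -- arrays in, 2-D array out
FROM   sched_entry;
-- seids: {623,625}   min_crew_arr: {{a,b},{c,d}}
```

All input arrays must have the same dimensions, or the aggregate raises an error.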

There is a detailed rationale in the linked answer.

Extents have to match

In response to your comment, consider this quote from the manual on array types:

Multidimensional arrays must have matching extents for each dimension.
A mismatch causes an error.

There is no way around that: the array type does not allow such a mismatch in Postgres. You could pad your arrays with NULL values so that all dimensions have matching extents.

But I would rather convert the arrays to comma-separated lists with array_to_string() for the purpose of this query and use string_agg() to aggregate the text, preferably with a different separator. Using a newline in my example:

SELECT use.user_sched_id, array_agg(se.sched_entry_id) AS seids
,string_agg(array_to_string(se.min_crew, ','), E'\n') AS min_crews
FROM ...
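
A self-contained sketch of the same idea, with hypothetical rows of different lengths standing in for the real table (names and values are assumptions for illustration):

```sql
-- Sample data is hypothetical; arrays of different lengths are fine here.
WITH sched_entry(sched_entry_id, min_crew) AS (
   VALUES
     (623, '{a,b,c}'::varchar[])
   , (625, '{d}')
)
SELECT string_agg(array_to_string(min_crew, ','), E'\n'
                  ORDER BY sched_entry_id) AS min_crews
FROM   sched_entry;
-- min_crews holds two lines: 'a,b,c' and 'd'
```

Since the result is plain text, no extents have to match.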

Normalize

You might want to consider normalizing your schema to begin with. Typically, you would implement such an n:m relationship with a separate table like outlined in this example:

How to implement a many-to-many relationship in PostgreSQL?
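
A rough sketch of what such a layout could look like. All table and column names here are hypothetical, since the real schema is not shown in the question:

```sql
-- Hypothetical names throughout; adapt to the real schema.
CREATE TABLE crew (
   crew_id serial PRIMARY KEY
 , name    text NOT NULL
);

CREATE TABLE sched_entry (
   sched_entry_id serial PRIMARY KEY
);

-- Junction table: one row per (schedule entry, crew) pair
-- instead of an array column on sched_entry.
CREATE TABLE sched_entry_crew (
   sched_entry_id int REFERENCES sched_entry ON DELETE CASCADE
 , crew_id        int REFERENCES crew ON DELETE CASCADE
 , PRIMARY KEY (sched_entry_id, crew_id)
);
```

With rows instead of arrays, plain array_agg() or string_agg() over the junction table is enough.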

array_agg for Array Types

You could write a custom aggregate to handle your specific array of arrays, e.g.:

DROP TABLE IF EXISTS e;
CREATE TABLE e
(
id serial PRIMARY KEY,
alert_type text,
date_happened timestamp with time zone
);

INSERT INTO e(alert_type, date_happened) VALUES
('red', '2011-05-10 10:15:06'),
('yellow', '2011-06-22 20:01:19');

CREATE OR REPLACE FUNCTION array_agg_custom_cut(anyarray)
RETURNS anyarray
AS 'SELECT $1[2:array_length($1, 1)]'
LANGUAGE SQL IMMUTABLE;

DROP AGGREGATE IF EXISTS array_agg_custom(anyarray);
CREATE AGGREGATE array_agg_custom(anyarray)
(
SFUNC = array_cat,
STYPE = anyarray,
FINALFUNC = array_agg_custom_cut,
INITCOND = $${{'', '', ''}}$$
);

Query:

SELECT
array_agg_custom(
ARRAY[
alert_type::text,
id::text,
CAST(extract(epoch FROM date_happened) AS text)
])
FROM e;

Result:

              array_agg_custom              
--------------------------------------------
{{red,1,1305036906},{yellow,2,1308787279}}
(1 row)

EDIT:

Here is a second, shorter way. You don't need the array_agg_custom_cut function, but, as you can see, an additional ARRAY level is necessary in the query:

DROP AGGREGATE IF EXISTS array_agg_custom(anyarray); -- replace the earlier definition
CREATE AGGREGATE array_agg_custom(anyarray)
(
SFUNC = array_cat,
STYPE = anyarray
);

SELECT
array_agg_custom(
ARRAY[
ARRAY[
alert_type::text,
id::text,
CAST(extract(epoch FROM date_happened) AS text)
]
])
FROM e;

Result:

              array_agg_custom              
--------------------------------------------
{{red,1,1305036906},{yellow,2,1308787279}}
(1 row)

How to make array_agg() work like group_concat() from MySQL

In PostgreSQL 8.4 you cannot explicitly order the input of array_agg(), but you can work around it by ordering the rows passed to the aggregate in a subquery:

SELECT id, array_to_string(array_agg(image), ',')
FROM (SELECT * FROM test ORDER BY id, rank) x
GROUP BY id;

Since PostgreSQL 9.0, aggregate expressions can have an ORDER BY clause:

SELECT id, array_to_string(array_agg(image ORDER BY rank), ',')
FROM test
GROUP BY id;
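
Postgres 9.0 also added string_agg(), which is arguably the closest match for group_concat() because it skips the intermediate array. A sketch with made-up rows standing in for the test table (the file names are hypothetical):

```sql
-- Requires Postgres 9.0+; sample rows are hypothetical.
WITH test(id, rank, image) AS (
   VALUES
     (1, 2, 'b.png')
   , (1, 1, 'a.png')
   , (2, 1, 'c.png')
)
SELECT id
     , string_agg(image, ',' ORDER BY rank) AS images  -- sorted per group
FROM   test
GROUP  BY id
ORDER  BY id;
-- 1 | a.png,b.png
-- 2 | c.png
```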

Sort a text aggregate created with array_agg in postgresql

This is available since PostgreSQL 9.0:

http://www.postgresql.org/docs/9.0/static/release-9-0.html, Section E.1.3.6.1. Aggregates

In older versions, you could do something like this, which may solve the problem (albeit in a clunky way):

SELECT array_agg(animal_name)
FROM (
SELECT "name" AS animal_name
FROM animals
ORDER BY "name"
) AS sorted_animals;

Alternatives to array_agg() or string_agg() on Redshift

You have to use LISTAGG on Redshift.

For each group in a query, the LISTAGG aggregate function orders the rows for that group according to the ORDER BY expression, then concatenates the values into a single string.

LISTAGG is a compute-node only function. The function returns an error if the query doesn't reference a user-defined table or Amazon Redshift system table.

Your query would look like this:

select _bs, 
listagg(_wbns,',')
within group (order by _wbns) as val
from bag
group by _bs
order by _bs;

For a better understanding, see the LISTAGG documentation.

Selecting data into a Postgres array

You cannot use array_agg() to produce multi-dimensional arrays, at least not up to PostgreSQL 9.4.

(But Postgres 9.5 ships a new variant of array_agg() that can!)

What you get out of @Matt Ball's query is an array of records (the_table[]).

An array can only hold elements of the same base type. You obviously have number and string types. Convert all columns (that aren't already) to text to make it work.

You can create an aggregate function for this like I demonstrated to you here before.

CREATE AGGREGATE array_agg_mult (anyarray)  (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND = '{}'
);

Call:

SELECT array_agg_mult(ARRAY[ARRAY[name, id::text, url]]) AS tbl_mult_arr
FROM tbl;

Note the additional ARRAY[] layer to make it a multidimensional array (2-dimensional, to be precise).
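
One caveat, as far as I am aware: Postgres 14 changed the signature of array_cat() from anyarray to anycompatiblearray, so the definition above fails there because array_cat(anyarray, anyarray) no longer exists. A sketch of the equivalent definition for Postgres 14 or later:

```sql
-- Postgres 14+ only: array_cat() now takes anycompatiblearray.
CREATE AGGREGATE array_agg_mult (anycompatiblearray) (
   SFUNC    = array_cat
 , STYPE    = anycompatiblearray
 , INITCOND = '{}'
);
```

The call stays the same. On Postgres 9.5 or later, the built-in array_agg(ARRAY[name, id::text, url]) should give the same 2-dimensional result without any custom aggregate.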

Instant demo:

WITH tbl(id, txt) AS (
VALUES
(1::int, 'foo'::text)
,(2, 'bar')
,(3, '}b",') -- txt has meta-characters
)
, x AS (
SELECT array_agg_mult(ARRAY[ARRAY[id::text,txt]]) AS t
FROM tbl
)
SELECT *, t[1][1] AS arr_element_1_1, t[3][2] AS arr_element_3_2
FROM x;

PostgreSQL retrieve records with items in array with array_agg

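This comes down to how Postgres parses the expression: ANY (SELECT ...) is read as the subquery form of ANY, which compares item_code to each returned row (here a single array value), so the types do not match. With an extra cast to varchar[], the operand becomes an array expression and it works fine: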

SELECT * FROM items
WHERE item_code = ANY((
SELECT array_agg(DISTINCT item_code)
FROM customer_orders
WHERE required_shipping_date BETWEEN '2013-12-01' AND '2013-12-15')::varchar[]);

Anyway, there is no need to use array in this case:

SELECT * FROM items
WHERE item_code IN (
SELECT DISTINCT item_code
FROM customer_orders
WHERE required_shipping_date BETWEEN '2013-12-01' AND '2013-12-15');
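
To see the two forms of ANY side by side, here is a small sketch. The CTEs and their values are hypothetical stand-ins for the real items and customer_orders tables:

```sql
-- Sample data is hypothetical.
WITH items(item_code) AS (
   VALUES ('A1'::varchar), ('B2'), ('C3')
)
, customer_orders(item_code) AS (
   VALUES ('A1'::varchar), ('C3'), ('C3')
)
SELECT i.item_code
       -- subquery form: compared against every row the subquery returns
     , i.item_code = ANY (SELECT o.item_code FROM customer_orders o) AS subquery_form
       -- array form: the cast turns the scalar subquery into an array expression
     , i.item_code = ANY ((SELECT array_agg(DISTINCT o.item_code)
                           FROM   customer_orders o)::varchar[])     AS array_form
FROM   items i
ORDER  BY i.item_code;
-- A1 | t | t
-- B2 | f | f
-- C3 | t | t
```

Both forms agree, which is why the plain IN (or ANY) subquery is the simpler choice here.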

Pad arrays with NULL to maximum length for custom aggregate function

Using the custom aggregate function array_agg_mult() as defined in this related answer:

  • Selecting data into a Postgres array

Your expected result is impossible:

{{1},NULL,{abc}}

Would have to be:

{{1},{NULL},{abc}}

Simple case with 0 or 1 array elements

For the simple case, just replace the empty array:

WITH t(arr) AS (
VALUES
('{1}'::text[])
,('{}')
,('{abc}')
)
SELECT array_agg_mult(ARRAY[CASE WHEN arr = '{}' THEN '{NULL}' ELSE arr END])
FROM t;

Dynamic padding for n elements

Using array_fill() to pad arrays with NULL elements up to the maximum length:

SELECT array_agg_mult(ARRAY[
arr || array_fill(NULL::text
, ARRAY[max_elem - COALESCE(array_length(arr, 1), 0)])
]) AS result
FROM t, (SELECT max(array_length(arr, 1)) AS max_elem FROM t) t1;

Still only works for 1-dimensional basic arrays.

Explain

  • Subquery t1 computes the maximum length of the basic 1-dimensional array.
  • COALESCE(array_length(arr, 1), 0) computes the length of the array in this row.

    array_length() returns NULL for an empty array, so COALESCE defaults to 0.
  • Generate padding array for the difference in length with array_fill().
  • Append that to arr with ||
  • Aggregate like above with array_agg_mult().
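
The steps above can be tried in one self-contained statement. This sketch swaps in the built-in array_agg() for array input (Postgres 9.5+) so that no custom aggregate is required; the id column and the sample rows are hypothetical additions to make the order deterministic:

```sql
-- Requires Postgres 9.5+ for array_agg() on array input; sample data is hypothetical.
WITH t(id, arr) AS (
   VALUES
     (1, '{1}'::text[])
   , (2, '{}')
   , (3, '{abc,def}')
)
SELECT array_agg(
          -- pad each array with NULLs up to the longest array in the set
          arr || array_fill(NULL::text
                          , ARRAY[max_elem - COALESCE(array_length(arr, 1), 0)])
          ORDER BY id
       ) AS result
FROM   t
CROSS  JOIN (SELECT max(array_length(arr, 1)) AS max_elem FROM t) t1;
```

Each inner array should come out padded to two elements, so the extents match and the aggregate can build a 2-dimensional array.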

SQL Fiddle demonstrating all.

Output in SQL Fiddle is misleading, so I cast result to text there.

How to get an empty array in array_agg if condition is not met?

Please check the answer below and let me know whether it returns your desired output:

Schema and insert statements:

create table users_collections (user_id int, collection_id int, access varchar(20));
insert into users_collections values(3, 1, 'allow');
insert into users_collections values(3, 2, 'allow');
insert into users_collections values(4, 3, 'allow');
insert into users_collections values(3, 5, 'not allow');


create table collections_books (collection_id int, book_id int);
insert into collections_books values(2,24);
insert into collections_books values(3,35);
insert into collections_books values(3,25);
insert into collections_books values(1,36);
insert into collections_books values(1,22);
insert into collections_books values(1,24);
insert into collections_books values(2,34);
insert into collections_books values(5,344);
insert into collections_books values(6,474);

Query:

SELECT c.collection_id, (CASE WHEN max(u.access) = 'allow' AND max(u.user_id) = 3
THEN ARRAY_AGG(c.book_id)
ELSE '{null}'::int[] END)
FROM collections_books AS c
LEFT JOIN users_collections AS u
ON c.collection_id = u.collection_id
GROUP BY c.collection_id;

Output:

| collection_id | case |
|------------: | :---------|
| 3 | {NULL} |
| 5 | {NULL} |
| 6 | {NULL} |
| 2 | {24,34} |
| 1 | {36,24,22} |
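
If a truly empty array ({}) is wanted rather than {NULL}, one option is an aggregate FILTER clause (Postgres 9.4+) combined with COALESCE. This is a sketch against the tables defined above, not the method used in the query above: the alias book_ids is hypothetical, and it filters row by row instead of checking max() per group.

```sql
-- Requires Postgres 9.4+ for FILTER; run after the schema statements above.
SELECT c.collection_id
     , COALESCE(
          array_agg(c.book_id ORDER BY c.book_id)
             FILTER (WHERE u.access = 'allow' AND u.user_id = 3)
        , '{}'                    -- no qualifying rows: empty array instead of NULL
       ) AS book_ids
FROM   collections_books c
LEFT   JOIN users_collections u ON c.collection_id = u.collection_id
GROUP  BY c.collection_id
ORDER  BY c.collection_id;
-- 1 | {22,24,36}
-- 2 | {24,34}
-- 3 | {}
-- 5 | {}
-- 6 | {}
```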

db<>fiddle here


