Postgresql - Combining Multiple Rows With Several Attributes into One Row

PostgreSQL - Combining multiple rows with several attributes into one row?

If you want to avoid joins I'd offer such a construction

WITH cte AS (
SELECT DATE, ID, ScoreA, ScoreB, ScoreC,
row_number() over (partition by ID order by ScoreA desc) rnA,
row_number() over (partition by ID order by ScoreB desc) rnB,
row_number() over (partition by ID order by ScoreC desc) rnC,
FROM ...
WHERE DATE BETWEEN ... AND ...
), ids AS (
SELECT DISTINCT ID FROM cte
)
SELECT ID,
(SELECT ScoreA FROM cte t2 WHERE t2.ID = t.ID AND rnA = 1) ScoreA,
(SELECT DATE FROM cte t2 WHERE t2.ID = t.ID AND rnA = 1) Date_A,
(SELECT ScoreB FROM cte t2 WHERE t2.ID = t.ID AND rnB = 1) ScoreB,
(SELECT DATE FROM cte t2 WHERE t2.ID = t.ID AND rnB = 1) Date_B,
(SELECT ScoreC FROM cte t2 WHERE t2.ID = t.ID AND rnC = 1) ScoreC,
(SELECT DATE FROM cte t2 WHERE t2.ID = t.ID AND rnC = 1) Date_C
FROM ids t

When you need date or some other attribute of max/min value it is reasonable to use row numbering instead of aggregating functions: row_number() over (...) as rn followed by condition rn = 1

UPD

As soon as @TaurusDang wants to have code generation, there is my solution to allow postgres to do almost all the work:

WITH cols AS
(
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'your_schema'
AND table_name = 'your_table'
AND column_name like 'Score%'
)
-- first part: rows for cte subquery
SELECT ',row_number() over (partition by ID order by ' || column_name || ' desc) rn' || column_name
FROM cols
UNION ALL
-- second part: rows for final query
SELECT ',(SELECT ' || column_name || ' FROM cte t2 WHERE t2.ID = t.ID AND rn' || column_name || ' = 1) ' || column_name || ', (SELECT DATE FROM cte t2 WHERE t2.ID = t.ID AND rn' || column_name || ' = 1) Date_' || column_name
FROM cols

Just copy generated rows into the initial query: the first half to cte and the second one to the main query

Combining multiple rows into single row in postgresql

Your desired result is not 1NF as you want to return a list of results within a single record (i.e. for a key 01102027001|ABRA|--|--, you would want to get ITEM NAME|10 | EMERY | EMERY | Y | --, TYPE |20 | -- | -- | Y | test, and so on). With purely relational sql queries this is hard (or even impossible) to achieve, and it isactually very very seldom desired that way.

Relational databases (and prostgresql) can deal with NON-1NF structures, e.g. by using nested tables or XML output. However, it's probably much easier to adapt the presentation layer such that the common part 01102027001|ABRA|--|-- is printed just once.

Postgresql combine multiple rows into multiple columns

You can use conditional aggregation. Postgres provides filter for this purpose:

select tm.name, t.id,
max(t.text) filter (where t.language_id = 'bg') as bg,
max(t.text) filter (where t.language_id = 'cs') as cs,
. . .
from translate_dev.tab_translation t join
modules tm
on tm.translation_id = t.id

where t.id in (. . . )
group by tm.name, t.id

Concatenate multiple result rows of one column into one, group by another column

Simpler with the aggregate function string_agg() (Postgres 9.0 or later):

SELECT movie, string_agg(actor, ', ') AS actor_list
FROM tbl
GROUP BY 1;

The 1 in GROUP BY 1 is a positional reference and a shortcut for GROUP BY movie in this case.

string_agg() expects data type text as input. Other types need to be cast explicitly (actor::text) - unless an implicit cast to text is defined - which is the case for all other string types (varchar, character, name, ...) and some other types.

As isapir commented, you can add an ORDER BY clause in the aggregate call to get a sorted list - should you need that. Like:

SELECT movie, string_agg(actor, ', ' ORDER BY actor) AS actor_list
FROM tbl
GROUP BY 1;

But it's typically faster to sort rows in a subquery. See:

  • Create array in SELECT

Select multiple row values into single row with multi-table clauses

If each attribute only has a single value for a user, you can start by making a sparse matrix:

SELECT user_id
,CASE WHEN attrib_id = 1 THEN value ELSE NULL END AS attrib_1_val
,CASE WHEN attrib_id = 2 THEN value ELSE NULL END AS attrib_2_val
FROM UserAttribute;

Then compress the matrix using an aggregate function:

SELECT user_id
,MAX(CASE WHEN attrib_id = 1 THEN value ELSE NULL END) AS attrib_1_val
,MAX(CASE WHEN attrib_id = 2 THEN value ELSE NULL END) AS attrib_2_val
FROM UserAttribute
GROUP BY user_id;

In response to the comment, searching by attribute name rather than id:

SELECT ua.user_id
,MAX(CASE WHEN a.attrib_name = 'attrib1' THEN ua.value ELSE NULL END) AS attrib_1_val
,MAX(CASE WHEN a.attrib_name = 'attrib2' THEN ua.value ELSE NULL END) AS attrib_2_val
FROM UserAttribute ua
JOIN Attribute a ON (a.attrib_id = ua.attrib_id)
WHERE a.attrib_name IN ('attrib1', 'attrib2')
GROUP BY ua.user_id;

Merge multiple rows into one with more than one row value in a column

From your current state you simply can do the pivot using the FILTER clause:

demo:db<>fiddle

SELECT
response,
document,
MAX(bill) FILTER (WHERE label = 'bill') as bill,
MAX(answer) FILTER (WHERE label = 'amount') as amount,
MAX(product) FILTER (WHERE label = 'product') as product,
MAX(answer) FILTER (WHERE label = 'price') as price
FROM t
GROUP BY response, document

I am not quite sure, how your original table looks like. If it is more like this:

response | document | label   | value
-------: | -------: | :------ | :----
71788176 | 79907201 | bill | 26899
71788176 | 79907201 | amount | 1
71788176 | 79907201 | product | shoes
71788176 | 79907201 | price | 25.99

Then you can modify the query like this:

demo:db<>fiddle

SELECT
response,
document,
MAX(value) FILTER (WHERE label = 'bill') as bill,
MAX(value) FILTER (WHERE label = 'amount') as amount,
MAX(value) FILTER (WHERE label = 'product') as product,
MAX(value) FILTER (WHERE label = 'price') as price
FROM t
GROUP BY response, document

Edit: TO added the JSON value to product column:

demo:db<>fiddle

Variant 1: You could simply cast the type json into type text:

MAX(product::text) FILTER (WHERE label = 'product') as product,

Variant 2: You read the value from the "name" attribute:

MAX(product ->> 'name') FILTER (WHERE label = 'product') as product,

PostgresQL for each row, generate new rows and merge

The set-returning function generate_series does exactly that:

SELECT
id,
generate_series(1, (max-min)/2+1) AS index,
generate_series(min, max, 2) AS value
FROM
example;

(online demo)

The index can alternatively be generated with RANK() (example, see also @a_horse_­with_­no_­name's answer) if you don't want to rely on the parallel sets.

How to combine multiple rows with similar ID but varying attributes into a single row using SQL?

You can use a group by clause to reduce the rows and perform aggregation(s) of rank.

SELECT isrc
, song_name
, count(*) as num_charts
, sum(cast rank AS INT) as sum_rank
, avg(cast rank AS decimal(10, 2)) as avg_rank
FROM (
(
SELECT rank
, isrc
, song_name
FROM `myTable`
WHERE chart_country = 'US'
)

UNION ALL ## use "union all" if summing or counting the resultset

(
SELECT rank
, isrc
, song_name
FROM `myTable2`
WHERE chart_country = 'US'
)
)
GROUP BY
isrc
, song_name
ORDER BY
sum_rank DESC
, avg_rank DESC

note: I'm not sure why you order by casting rank to an integer, if that column is not defined as numeric then you open the possibility of conversion error - but I have included casts "just in case".

Please note that "union", when used by itself removes duplicate rows, hence it has the potential to upset accurate sums, counts or averages etc. So instead use "union all" which does not attempt to remove duplicate rows (and can be a bit faster because of this).

I have removed 2 columns as they won't summarise well although you could use GROUP_CONCAT(SOURCE) (as noted by PM-77-1) and you could use MIN() or MAX() for the date column if that would be of use.



Related Topics



Leave a reply



Submit