Turning arbitrarily many rows into columns in PostgreSQL
The main problem with pivot tables in Postgres (and other RDBMSs) is that the structure (number and names of columns) of a query result cannot vary depending on the selected data. One possible solution is to dynamically create a view whose structure is defined by the data. The example function creates a view based on the table example_table:
create or replace function create_pivot_view()
returns void language plpgsql as $$
declare
list text;
begin
select string_agg(format('jdata->>%1$L "%1$s"', name), ', ')
from (
select distinct name
from example_table
) sub
into list;
execute format($f$
drop view if exists example_pivot_view;
create view example_pivot_view as
select lbl, %s
from (
select lbl, json_object_agg(name, value) jdata
from example_table
group by 1
order by 1
) sub
$f$, list);
end $$;
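For reference, a minimal example_table that produces the output below could look like this (hypothetical DDL, inferred from the view's result; the original answer does not show the schema):

create table example_table (lbl int, name text, value text);
insert into example_table (lbl, name, value) values
  (1, 'num', '1'), (1, 'colour', 'Red'),   (1, 'percentage', '25.0'),
  (2, 'num', '2'), (2, 'colour', 'Green'), (2, 'percentage', '50.0'),
  (3, 'num', '3'), (3, 'colour', 'Blue'),  (3, 'percentage', '75.0');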
Use the function after the table is modified (maybe in a trigger) and query the created view:
select create_pivot_view();
select *
from example_pivot_view;
lbl | num | colour | percentage
-----+-----+--------+------------
1 | 1 | Red | 25.0
2 | 2 | Green | 50.0
3 | 3 | Blue | 75.0
(3 rows)
Test it in db<>fiddle.
Note that it is only necessary to recreate the view (call the function) after a new name is added to the table (or an existing name is removed from it). If the set of distinct names doesn't change, you can query the view without recreating it. If the set changes frequently, creating a temporary view would be a better option.
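If you want the view kept up to date automatically, a statement-level trigger can call the function after each change. A sketch (the trigger and wrapper function names here are made up):

create or replace function refresh_pivot_view()
returns trigger language plpgsql as $$
begin
  perform create_pivot_view();
  return null;
end $$;

create trigger example_table_refresh
after insert or update or delete on example_table
for each statement execute function refresh_pivot_view();

The execute function spelling requires PostgreSQL 11 or later; on older versions write execute procedure instead.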
You may also be interested in Flatten aggregated key/value pairs from a JSONB field?
Combine multiple rows into multiple columns in PostgreSQL
You need to group by item_code to get a single row per group, then use an aggregate like max() to pick the non-null value for each group:
select
item_code,
max(
case name_type
when 'full' then name_value
end
) as full_name,
max(
case name_type
when 'short' then name_value
end
) as short_name,
max(
case name_type
when 'half' then name_value
end
) as half_name
from t
group by 1;
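On PostgreSQL 9.4 or later, the same pivot can be written more concisely with the aggregate FILTER clause:

select
  item_code,
  max(name_value) filter (where name_type = 'full')  as full_name,
  max(name_value) filter (where name_type = 'short') as short_name,
  max(name_value) filter (where name_type = 'half')  as half_name
from t
group by 1;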
Transform Postgres rows data into columns based on condition
Below is the query to create the table and insert some sample data.
begin;
create table trans_custom_fields(id text, _value text,transid integer );
insert into trans_custom_fields values('ACCOUNT_HOLDER_NAME','Manoj Sharma',1);
insert into trans_custom_fields values('ACCOUNT_NUMBER', '113565TTE44656', 1);
insert into trans_custom_fields values( 'RECIPT_NUMBER', '24324.', 1);
insert into trans_custom_fields values( 'ACCOUNT_HOLDER_NAME', 'Another User', 2);
insert into trans_custom_fields values('ACCOUNT_NUMBER', '35546656TRFG23', 2);
insert into trans_custom_fields values('RECIPT_NUMBER', '24324686', 2);
commit;
Now I want to transform this data, and here I am going to use the crosstab feature of Postgres.
SELECT *
FROM crosstab(
'SELECT transid, id, _value
FROM trans_custom_fields
ORDER BY 1,2'
) AS ct (transid int, ACCOUNT_HOLDER_NAME text, ACCOUNT_NUMBER text);
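Note that crosstab() is provided by the tablefunc extension, which has to be installed once per database before the query above will run:

create extension if not exists tablefunc;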
I am really thankful for the crosstab example that helped me understand and write my own answer to my question. Thanks also to @mark, who provided the queries and a resolution, but the above fits better for now.
Convert multiple rows into one row with multiple columns
You can use CASE statements for simple queries, or the crosstab() function of the tablefunc module for more complex cases and better performance. You can find examples for both cases under this related question:
PostgreSQL Crosstab Query
Combining multiple rows into one
This would be simpler with a simpler schema:
- No domain types (what's the purpose?)
- Add an actual PK to table medium.
- Rather use a surrogate PK (a serial column) instead of the multicolumn PK and FK over two domain types. Or at least use the same (simpler) column name for columns with the same content: just alpha_id instead of m_alphaID and w_alphaID etc.
That aside, here are solutions for your setup as is:
True crosstab() solution
There are several specific difficulties for your crosstab() query:
- No single column that can serve as row_name.
- Multiple extra columns.
- No category column.
- No defined order for values (so I use arbitrary order instead).
Basics (read this first!):
- PostgreSQL Crosstab Query
For your special case:
- Pivot on Multiple Columns using Tablefunc
- Dynamic alternative to pivot with CASE and GROUP BY
Solution:
SELECT alphaid, numid, name_of_work, material_1, material_2, material_3
FROM crosstab(
'SELECT rn, w.alphaid, w.numid, w.name_of_work
, row_number() OVER (PARTITION BY rn) AS mat_nr -- order undefined!
, m_title AS Material_used
FROM (
SELECT w_alphaID AS alphaid, w_numID AS numid, w_title AS name_of_work
, row_number() OVER (ORDER BY w_alphaID, w_numID) AS rn
FROM works
) w
JOIN medium m ON w.alphaid = m.m_alphaID
AND w.numid = m.m_numID
ORDER BY rn, mat_nr'
, 'VALUES (1), (2), (3)' -- add more ...
)
AS ct (
rn bigint, alphaid text, numid int, name_of_work text
, material_1 text, material_2 text, material_3 text -- add more ...
);
Poor man's crosstab with standard SQL
If the additional module tablefunc cannot be installed or if top performance is not important, this simpler query does the same, slower:
SELECT w_alphaid AS alphaid, w_numid AS numid, w_title AS name_of_work
, arr[1] AS material_used_1
, arr[2] AS material_used_2
, arr[3] AS material_used_3 -- add more?
FROM works w
LEFT JOIN (
SELECT m_alphaid, m_numid, array_agg(m_title::text) AS arr
FROM medium
GROUP BY m_alphaid, m_numid
) m ON w.w_alphaid = m.m_alphaid
AND w.w_numid = m.m_numid;
The cast to text (or varchar ...) is necessary because there is no predefined array type for your custom domain. Alternatively you could define the missing array type.
One subtle difference to the above: using LEFT JOIN here instead of just JOIN to preserve rows from works that have no related materials in medium at all.
Since you return the whole table, it's cheaper to aggregate rows in medium before you join. For a small selection it might be cheaper to join first and then aggregate. Related:
- GROUP or DISTINCT after JOIN returns duplicates
How to use PL/pgSQL to construct a table with dynamic columns
You can dynamically create a view. The idea and the solution for a simpler case than yours is described in this answer. Please read it before continuing.
We will use the following query to create a view:
with all_locations(location) as (
select distinct location_a
from locations
union
select distinct location_b
from locations
)
select location_a as location, json_object_agg(location_b, count order by location_b) as data
from (
select a.location as location_a, b.location as location_b, count(l.*)
from all_locations a
cross join all_locations b
left join locations l on location_a = a.location and location_b = b.location
group by 1, 2
) s
group by 1
order by 1;
Results:
location | data
----------+----------------------------------------------------------------------------
Atlanta | { "Atlanta" : 0, "London" : 1, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
London | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
New York | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 0, "Tokyo" : 1 }
Sydney | { "Atlanta" : 0, "London" : 1, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
Tokyo | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 2, "Tokyo" : 0 }
(5 rows)
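For reference, sample data consistent with the results above could look like this (hypothetical; the original question's table may have more columns):

create table locations (location_a text, location_b text);
insert into locations (location_a, location_b) values
  ('Atlanta',  'London'),
  ('New York', 'Tokyo'),
  ('Sydney',   'London'),
  ('Tokyo',    'Sydney'),
  ('Tokyo',    'Sydney');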
The list of cities will be used twice inside the function, so it is stored in the array cities. Note that you can replace the first query in the function with a simpler one (it is simply an ordered list of distinct cities).
create or replace function create_locations_view()
returns void language plpgsql as $$
declare
cities text[];
list text;
begin
-- fill array with all cities in alphabetical order
select array_agg(location_a order by location_a)
from (
select distinct location_a
from locations
union
select distinct location_b
from locations
) s
into cities;
-- construct list of columns to use in select list
select string_agg(format($s$data->>'%1$s' "%1$s"$s$, city), ', ')
from unnest(cities) city
into list;
-- create view from select based on the above list
execute format($ex$
drop view if exists locations_view;
create view locations_view as
select location, %1$s
from (
select location_a as location, json_object_agg(location_b, count order by location_b) as data
from (
select a.location as location_a, b.location as location_b, count(l.*)
from unnest(%2$L::text[]) a(location)
cross join unnest(%2$L::text[]) b(location)
left join locations l on location_a = a.location and location_b = b.location
group by 1, 2
) s
group by 1
) s
order by 1
$ex$, list, cities);
end $$;
Use the function and select data from the created view:
select create_locations_view();
select * from locations_view;
location | Atlanta | London | New York | Sydney | Tokyo
----------+---------+--------+----------+--------+-------
Atlanta | 0 | 1 | 0 | 0 | 0
London | 0 | 0 | 0 | 0 | 0
New York | 0 | 0 | 0 | 0 | 1
Sydney | 0 | 1 | 0 | 0 | 0
Tokyo | 0 | 0 | 0 | 2 | 0
(5 rows)
I have used this method several times, but I have no experience with really large data sets, so I cannot guarantee that it is efficient.