Pivot on Multiple Columns Using Tablefunc

Pivot on Multiple Columns using Tablefunc

The problem with your query is that b and c share the same timestamp 2012-01-02 00:00:00, and you have the timestamp column timeof first in your query, so, even though you added bold emphasis, b and c are just extra columns that fall into the same group 2012-01-02 00:00:00. Only the first (b) is returned, since (quoting the manual):

The row_name column must be first. The category and value columns must be the last two columns, in that order. Any columns between row_name and category are treated as "extra". The "extra" columns are expected to be the same for all rows with the same row_name value.

Bold emphasis mine.

Just revert the order of the first two columns to make entity the row name and it works as desired:

SELECT * FROM crosstab(
      'SELECT entity, timeof, status, ct
       FROM   t4
       ORDER  BY 1'
     ,'VALUES (1), (0)')
 AS ct (
     "Attribute" character
   , "Section" timestamp
   , "status_1" int
   , "status_0" int);

entity must be unique, of course.
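
For reference, a minimal setup sketch matching the call above; the table name comes from the question, but the column types are only inferred from the query, so treat them as assumptions:

-- Hypothetical setup inferred from the crosstab call above; actual types may differ.
CREATE EXTENSION IF NOT EXISTS tablefunc;  -- provides crosstab()

CREATE TABLE t4 (
   timeof timestamp
 , entity character  -- becomes the row_name
 , status int        -- the category: 1 or 0
 , ct     int        -- the value
);

INSERT INTO t4 VALUES
   ('2012-01-01 00:00:00', 'a', 1, 1)
 , ('2012-01-01 00:00:00', 'a', 0, 2)
 , ('2012-01-02 00:00:00', 'b', 1, 3)
 , ('2012-01-02 00:00:00', 'c', 0, 4);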

To reiterate:

  • row_name first
  • (optional) extra columns next
  • category (as defined by the second parameter) and value last.

Extra columns are filled in from the first row of each row_name partition. Values from other rows are ignored; there is only one column per row_name to fill. Typically, those would be the same for every row of one row_name, but that's up to you.

For the different setup in your answer:

SELECT localt, entity
     , msrmnt01, msrmnt02, msrmnt03, msrmnt04, msrmnt05  -- , more?
FROM   crosstab(
        'SELECT dense_rank() OVER (ORDER BY localt, entity)::int AS row_name
              , localt, entity  -- additional columns
              , msrmnt, val
         FROM   test
         -- WHERE  ???  -- instead of LIMIT at the end
         ORDER  BY localt, entity, msrmnt
         -- LIMIT ???'  -- instead of LIMIT at the end
      , $$SELECT generate_series(1,5)$$)  -- more?
     AS ct (row_name int, localt timestamp, entity int
          , msrmnt01 float8, msrmnt02 float8, msrmnt03 float8, msrmnt04 float8, msrmnt05 float8  -- , more?
           )
LIMIT 1000  -- ??!!

No wonder the queries in your test perform terribly. Your test setup has 14M rows, and you process all of them before throwing most of them away with LIMIT 1000. For a reduced result set, add WHERE conditions or a LIMIT to the source query!

Plus, the array you work with is needlessly expensive on top of it. I generate a surrogate row name with dense_rank() instead.
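
For illustration, a sketch of the same call with a hypothetical WHERE condition pushed into the source query; the date range is made up and only stands in for whatever restriction fits your case (the inner query is dollar-quoted so the literal dates don't need escaping):

SELECT localt, entity
     , msrmnt01, msrmnt02, msrmnt03, msrmnt04, msrmnt05
FROM   crosstab(
        $q$
        SELECT dense_rank() OVER (ORDER BY localt, entity)::int AS row_name
             , localt, entity
             , msrmnt, val
        FROM   test
        WHERE  localt >= '2012-01-01' AND localt < '2012-02-01'  -- hypothetical filter
        ORDER  BY localt, entity, msrmnt
        $q$
      , $$SELECT generate_series(1,5)$$)
     AS ct (row_name int, localt timestamp, entity int
          , msrmnt01 float8, msrmnt02 float8, msrmnt03 float8, msrmnt04 float8, msrmnt05 float8);

This way crosstab() only ever sees the rows you actually want in the result.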

db<>fiddle here - with a simpler test setup and fewer rows.

Pivot table without crosstab/tablefunc

With conditional aggregation:

select id, author,
       sum(size) as size,
       sum((file_ext = 'python')::int) as python,
       sum((file_ext = 'cpp')::int) as cpp,
       sum((file_ext = 'non-code')::int) as "non-code"
from tablename
group by id, author;

See the demo.

Results:

 id | author |  size | python | cpp | non-code
----+--------+-------+--------+-----+---------
  1 | a      | 28385 |      1 |   1 |        1
  2 | b      | 10633 |      2 |   1 |        1
  3 | c      |  2338 |      1 |   1 |        0
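
The same pivot can also be written with the FILTER clause (used in the answers further down); a sketch assuming the same tablename and columns:

select id, author,
       sum(size) as size,
       count(*) filter (where file_ext = 'python') as python,
       count(*) filter (where file_ext = 'cpp') as cpp,
       count(*) filter (where file_ext = 'non-code') as "non-code"
from tablename
group by id, author;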

Pivot table using crosstab and count

1. Static solution with a limited list of marking values:

SELECT year
, TO_CHAR( creation_date, 'Month') AS month
, COUNT(*) FILTER (WHERE marking = 'Delivered') AS Delivered
, COUNT(*) FILTER (WHERE marking = 'Not delivered') AS "Not delivered"
, COUNT(*) FILTER (WHERE marking = 'Not Received') AS "Not Received"
FROM invoices
GROUP BY 1,2
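
This and the dynamic variant below assume an invoices table roughly shaped like the following sketch, inferred from the queries; the real table certainly has more columns:

-- Hypothetical layout; adjust names and types to your schema.
CREATE TABLE invoices (
   year          int
 , creation_date date
 , marking       text   -- 'Delivered', 'Not delivered', 'Not Received', ...
);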

2. Full dynamic solution with a large list of marking values:

This proposal is an alternative to the crosstab solutions proposed in A and B.

The solution proposed here only requires a dedicated composite type, which can be created dynamically; it then relies on the jsonb type and standard functions.

Starting from your query, which counts the number of rows per year, month and marking value:

  • Using the jsonb_object_agg function, the resulting rows are first
    aggregated by year and month into jsonb objects whose keys
    correspond to the marking values and whose values correspond to
    the counts.
  • Using the jsonb_populate_record function and the dedicated
    composite type, the resulting jsonb objects are then converted
    into records (see the minimal illustration after this list).
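
As a minimal illustration of the mechanism, with a hand-written composite type and a literal jsonb value standing in for what jsonb_object_agg produces (the type name and the numbers are made up for the example):

CREATE TYPE demo_type AS ("Delivered" bigint, "Not delivered" bigint);

SELECT (jsonb_populate_record( null :: demo_type
                             , '{"Delivered": 3, "Not delivered": 1}' :: jsonb
                             )
       ).*;

--  Delivered | Not delivered
-- -----------+---------------
--          3 |             1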

First, we dynamically create a composite type that corresponds to the ordered list of marking values:

CREATE OR REPLACE PROCEDURE create_composite_type() LANGUAGE plpgsql AS $$
DECLARE
  column_list text;
BEGIN
  SELECT string_agg(DISTINCT quote_ident(marking) || ' bigint', ',' ORDER BY quote_ident(marking) || ' bigint' ASC)
  INTO   column_list
  FROM   invoices;

  EXECUTE 'DROP TYPE IF EXISTS composite_type';
  EXECUTE 'CREATE TYPE composite_type AS (' || column_list || ')';
END;
$$;

CALL create_composite_type();

Then the expected result is provided by the following query:

SELECT a.year
     , TO_CHAR(a.year_month, 'Month') AS month
     , (jsonb_populate_record( null :: composite_type
                             , jsonb_object_agg(a.marking, a.count)
                             )
       ).*
FROM  ( SELECT year
             , date_trunc('month', creation_date) AS year_month
             , marking
             , count(*) AS count
        FROM   invoices AS v
        GROUP  BY 1,2,3
      ) AS a
GROUP BY 1,2
ORDER BY month

Obviously, if the list of marking values may vary over time, then you have to call the create_composite_type() procedure again just before executing the query. If you don't update composite_type, the query will still work (no error!), but some old marking values may be obsolete (not used anymore), and some new marking values may be missing from the query result (not displayed as columns).

See the full demo in dbfiddle.

How to pivot multiple columns in postgres

Use a union query:

select date, 'a1' as metric, a1_on_time as on_time, a1_days as days from your_table
union all
select date, 'b2', b2_on_time, b2_days from your_table
order by metric, date;
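
This assumes a source table roughly like the following (a sketch; the column types are guesses based on the names):

-- Hypothetical layout inferred from the query above.
CREATE TABLE your_table (
   date       date
 , a1_on_time int
 , a1_days    int
 , b2_on_time int
 , b2_days    int
);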

How to create a PostgreSQL pivot table that pivots multiple columns?

You can try using conditional aggregation:

select system, microservice, month,
       max(case when metric = 'uptime' then value end) as uptime_value,
       max(case when metric = 'uptime' then confidence_level end) as uptime_confidence,
       max(case when metric = 'lag' then value end) as lag_value,
       max(case when metric = 'lag' then confidence_level end) as lag_confidence
from tablename
group by system, microservice, month;
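
Note that max(case ...) picks a single value per metric, which assumes at most one row per (system, microservice, month, metric) combination. The assumed input shape, as a sketch:

-- Hypothetical layout inferred from the query; unquoted identifiers fold to lower case anyway.
CREATE TABLE tablename (
   system           text
 , microservice     text
 , month            text
 , metric           text     -- 'uptime' or 'lag'
 , value            numeric
 , confidence_level numeric
);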

Postgres CROSSTAB Query not able to get all the expected Columns

You could just use conditional aggregation:

select student,
max(marks) filter (where subject = 'Music') as music,
max(marks) filter (where subject = 'Maths') as maths,
max(marks) filter (where subject = 'History') as history,
max(marks) filter (where subject = 'Language') as language,
max(marks) filter (where subject = 'Geography') as geography,
rollno
from t
group by student, rollno;
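
If you prefer to stay with crosstab(), the call would look roughly like this; a sketch that assumes the tablefunc extension and a table t(student, rollno, subject, marks) with marks of type int, so adjust the column types as needed. Note the column order inside the source query: row_name (student) first, the extra column (rollno) next, category (subject) and value (marks) last, as explained in the first answer above.

SELECT *
FROM   crosstab(
        $$SELECT student, rollno, subject, marks
          FROM   t
          ORDER  BY 1$$
      , $$VALUES ('Music'), ('Maths'), ('History'), ('Language'), ('Geography')$$)
     AS ct (student text, rollno int
          , music int, maths int, history int, language int, geography int);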

