Transpose Rows into Columns in Bigquery (Pivot Implementation)

Transpose rows into columns in BigQuery (Pivot implementation)

BigQuery does not support yet pivoting functions

You still can do this in BigQuery using below approach

But first, in addition to two columns in input data you must have one more column that would specify groups of rows in input that needs to be combined into one row in output

So, I assume your input table (yourTable) looks like below

**id**  **Key**                  **Value**
1 channel_title Mahendra Guru
1 youtube_id ugEGMG4-MdA
1 channel_id UCiDKcjKocimAO1tV
1 examId 72975611-4a5e-11e5
1 postId 1189e340-b08f

2 channel_title Ab Live
2 youtube_id 3TNbtTwLY0U
2 channel_id UCODeKM_D6JLf8jJt
2 examId 72975611-4a5e-11e5
2 postId 0c3e6590-afeb

So, first you should run below query

SELECT 'SELECT id, ' + 
GROUP_CONCAT_UNQUOTED(
'MAX(IF(key = "' + key + '", value, NULL)) as [' + key + ']'
)
+ ' FROM yourTable GROUP BY id ORDER BY id'
FROM (
SELECT key
FROM yourTable
GROUP BY key
ORDER BY key
)

Result of above query will be string that (if to format) will look like below

SELECT 
id,
MAX(IF(key = "channel_id", value, NULL)) AS [channel_id],
MAX(IF(key = "channel_title", value, NULL)) AS [channel_title],
MAX(IF(key = "examId", value, NULL)) AS [examId],
MAX(IF(key = "postId", value, NULL)) AS [postId],
MAX(IF(key = "youtube_id", value, NULL)) AS [youtube_id]
FROM yourTable
GROUP BY id
ORDER BY id

you should now copy above result (note: you don't really need to format it - i did it for presenting only) and run it as normal query

Result will be as you would expected

id  channel_id          channel_title   examId              postId          youtube_id   
1 UCiDKcjKocimAO1tV Mahendra Guru 72975611-4a5e-11e5 1189e340-b08f ugEGMG4-MdA
2 UCODeKM_D6JLf8jJt Ab Live 72975611-4a5e-11e5 0c3e6590-afeb 3TNbtTwLY0U

Please note: you can skip Step 1 if you can construct proper query (as in step 2) by yourself and number of fields small and constant or if it is one time deal. But Step 1 just helper step that makes it for you, so you can create it fast any time!

If you are interested - you can see more about pivoting in my other posts.

How to scale Pivoting in BigQuery?

Please note – there is a limitation of 10K columns per table - so you are limited with 10K organizations.

You can also see below as simplified examples (if above one is too complex/verbose):

How to transpose rows to columns with large amount of the data in BigQuery/SQL?

How to create dummy variable columns for thousands of categories in Google BigQuery?

Pivot Repeated fields in BigQuery

Transpose rows into columns in BigQuery using standard sql

for BigQuery Standard SQL



#standardSQL
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid

You can play/test with below dummy data from your question

#standardSQL
WITH `mytable` AS (
SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL
SELECT 'A', 'Product', 2 UNION ALL
SELECT 'A', 'Product', 3 UNION ALL
SELECT 'A', 'Login', 4 UNION ALL
SELECT 'A', 'Delivery', 5 UNION ALL
SELECT 'B', 'Clearance', 1 UNION ALL
SELECT 'B', 'Search', 2 UNION ALL
SELECT 'B', 'Product', 3 UNION ALL
SELECT 'C', 'Search', 1 UNION ALL
SELECT 'C', 'Collection', 2 UNION ALL
SELECT 'C', 'Product', 3
)
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid

result is

uniqueid    p1          p2          p3      p4      p5   
A Collection Product Product Login Delivery
B Clearance Search Product null null
C Search Collection Product null null

Depends on your needs you can also consider below approach (not pivot though)

#standardSQL
SELECT uniqueid,
STRING_AGG(page_flag, '>' ORDER BY order_of_pages) AS journey
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid

if to run with same dummy data as above - result is

uniqueid    journey  
A Collection>Product>Product>Login>Delivery
B Clearance>Search>Product
C Search>Collection>Product

How to convert or transpose rows to columns in SQL without using pivot?

Try STRING_AGG instead of any_value:

select id, string_agg(if(source_name = 'cp', value, null)) as cp,
string_agg(if(source_name = 'hi', value, null)) as hi,
string_agg(if(source_name = 'li', value, null)) as li
string_agg(if(source_name = 'mi', value, null)) as mi
from table_name group by id

Transpose rows into columns in Standard BigQuery (Pivot implementation)

In #standardSQL use CONCAT() instead of + to compose strings:

'MAX(IF(key = "'+key+'", value, NULL)) as ['+key+']'

->

CONCAT('MAX(IF(key = "',key,'", value, NULL)) as [',key,']')

BigQuery pivot multiple rows into columns

Consider below approach

select * from your_table
pivot (
any_value(ACount) as ACount,
any_value(BCount) as ACount
for replace(Date, '-', '_') in ('2021_12_07', '2021_12_08')
)

if applied to sample data in your question - output is

Hope, you can use it in your real use-case

Sample Image

Transpose Rows into column in bigquery

Consider below

select error, regexp_extract_all(error, r'\d{2}') codes
from data

if applied to sample data in your question - output is

Sample Image

convert rows into columns - Bigquery


I did check this post but unable to translate it?

Let's first transform your original data into form that we then can pivot

Below does this:

#standardSQL
SELECT subject_id, hm_id, icu_id, balance,
DATE_DIFF(day, MIN(day) OVER(PARTITION BY subject_id, hm_id, icu_id), DAY) + 1 delta
FROM `project.dataset.table`
-- ORDER BY subject_id, hm_id, icu_id, delta

If to apply to sample data from your question - result is

Row subject_id  hm_id   icu_id  balance delta    
1 124 ab cd 2 1
2 124 ab cd 5 2
3 321 xy pq -6 1
4 321 xy pq 1 2

So, now we need to pivot this based on delta column - balance for delta = 1 will go to day_1_balance, balance for delta = 2 will go to day_2_balance and so on

Let's for now assume that there are just two deltas (as in your sample data). In this simplified case - below will make a trick

#standardSQL
SELECT subject_id, hm_id, icu_id,
MAX(IF(delta = 1, balance, NULL)) day_1_balance,
MAX(IF(delta = 2, balance, NULL)) day_2_balance
FROM (
SELECT subject_id, hm_id, icu_id, balance,
DATE_DIFF(day, MIN(day) OVER(PARTITION BY subject_id, hm_id, icu_id), DAY) + 1 delta
FROM `project.dataset.table`
)
GROUP BY subject_id, hm_id, icu_id
-- ORDER BY subject_id, hm_id, icu_id

with result

Row subject_id  hm_id   icu_id  day_1_balance   day_2_balance    
1 124 ab cd 2 5
2 321 xy pq -6 1

Obviously, in real case you don't know how many delta columns you have so you need to build above query dynamically - and that is exactly where post you referenced - will help you

You can try again by yourself - or see below for final solution

Step 1 - generating query

#standardSQL
WITH temp AS (
SELECT subject_id, hm_id, icu_id, balance,
DATE_DIFF(day, MIN(day) OVER(PARTITION BY subject_id, hm_id, icu_id), DAY) + 1 delta
FROM `project.dataset.table`
)
SELECT CONCAT('SELECT subject_id, hm_id, icu_id,',
STRING_AGG(
CONCAT(' MAX(IF(delta = ',CAST(delta AS STRING),', balance, NULL)) as day_',CAST(delta AS STRING),'_balance')
)
,' FROM temp GROUP BY subject_id, hm_id, icu_id ORDER BY subject_id, hm_id, icu_id')
FROM (
SELECT delta
FROM temp
GROUP BY delta
ORDER BY delta
)

Result of step 1 is the text that represent final query that you need to run as step 2

Step 2 - run generated query

#standardSQL
WITH temp AS (
SELECT subject_id, hm_id, icu_id, balance,
DATE_DIFF(day, MIN(day) OVER(PARTITION BY subject_id, hm_id, icu_id), DAY) + 1 delta
FROM `project.dataset.table`
)
SELECT subject_id, hm_id, icu_id,
MAX(IF(delta = 1, balance, NULL)) AS day_1_balance,
MAX(IF(delta = 2, balance, NULL)) AS day_2_balance
FROM temp
GROUP BY subject_id, hm_id, icu_id
-- ORDER BY subject_id, hm_id, icu_id

String Aggregtion to create pivot columns in Big Query

Below example is for BigQuery Standard SQL

Assume you have data as below

#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 'channel_title' key, 'Mahendra Guru' value UNION ALL
SELECT 1, 'youtube_id', 'ugEGMG4-MdA' UNION ALL
SELECT 1, 'channel_id', 'UCiDKcjKocimAO1tV' UNION ALL
SELECT 1, 'examId', '72975611-4a5e-11e5' UNION ALL
SELECT 1, 'postId', '1189e340-b08f' UNION ALL

SELECT 2, 'channel_title', 'Ab Live' UNION ALL
SELECT 2, 'youtube_id', '3TNbtTwLY0U' UNION ALL
SELECT 2, 'channel_id', 'UCODeKM_D6JLf8jJt' UNION ALL
SELECT 2, 'examId', '72975611-4a5e-11e5' UNION ALL
SELECT 2, 'postId', '0c3e6590-afeb'
)

to get those keys pivoted - you can use below query

#standardSQL
SELECT id,
MAX(IF(key = "channel_id", value, NULL)) AS `channel_id`,
MAX(IF(key = "channel_title", value, NULL)) AS `channel_title`,
MAX(IF(key = "examId", value, NULL)) AS `examId`,
MAX(IF(key = "postId", value, NULL)) AS `postId`,
MAX(IF(key = "youtube_id", value, NULL)) AS `youtube_id`
FROM `project.dataset.table`
GROUP BY id

with result

Row id  channel_id          channel_title   examId              postId          youtube_id   
1 1 UCiDKcjKocimAO1tV Mahendra Guru 72975611-4a5e-11e5 1189e340-b08f ugEGMG4-MdA
2 2 UCODeKM_D6JLf8jJt Ab Live 72975611-4a5e-11e5 0c3e6590-afeb 3TNbtTwLY0U

to make your life simpler and not to type all those line for each and every key - you can instead generate the whole SQL text using below query

#standardSQL
SELECT 'SELECT id, ' ||
STRING_AGG(
'MAX(IF(key = "' || key || '", value, NULL)) as `' || key || '`'
)
|| ' FROM `project.dataset.table` GROUP BY id ORDER BY id'
FROM (
SELECT key
FROM `project.dataset.table`
GROUP BY key
ORDER BY key
)

Note: the post you mentioned in your question was given in 2016 and for Legacy SQL



Related Topics



Leave a reply



Submit