Support UNION function in BigQuery SQL

If you want UNION so that you can combine query results, you can use
comma-separated subselects in BigQuery legacy SQL, where the comma between
tables or subselects acts as UNION ALL:

SELECT foo, bar 
FROM
(SELECT integer(id) AS foo, string(title) AS bar
FROM publicdata:samples.wikipedia limit 10),
(SELECT integer(year) AS foo, string(state) AS bar
FROM publicdata:samples.natality limit 10);

This is almost exactly equivalent to the SQL:

SELECT id AS foo, title AS bar 
FROM publicdata:samples.wikipedia limit 10
UNION ALL
SELECT year AS foo, state AS bar
FROM publicdata:samples.natality limit 10;

(Note that if you want SQL UNION, which removes duplicate rows, rather than UNION ALL, this won't work.)

Alternatively, you could run two queries and append the results.
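
For readers who are not tied to legacy SQL, here is a minimal sketch of the duplicate-removing variant in BigQuery standard SQL, which spells it UNION DISTINCT. The two CTEs and their inline rows are hypothetical stand-ins for the sample tables, not data from the original answer.

```sql
-- Hypothetical inline data standing in for the two source tables.
WITH wikipedia_sample AS (
  SELECT 1 AS foo, 'x' AS bar UNION ALL
  SELECT 2, 'y'
),
natality_sample AS (
  SELECT 2 AS foo, 'y' AS bar UNION ALL
  SELECT 3, 'z'
)
SELECT foo, bar FROM wikipedia_sample
UNION DISTINCT  -- swap in UNION ALL to keep the duplicate (2, 'y') row
SELECT foo, bar FROM natality_sample;
```

With UNION DISTINCT the row (2, 'y') appears once; with UNION ALL it would appear twice.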

How to use ST_UNION in BigQuery

You should use ST_UNION_AGG instead of ST_UNION.

ST_UNION makes a union horizontally, within each row: use it when you have a column holding an array of geography objects that you want to combine into a single one, or two columns of geography objects that you want to merge into one.
After the operation, your table has the same number of rows.

ST_UNION_AGG makes a union vertically, across rows: you have one column of geography objects that you want to aggregate into a single one (perhaps per group).
After the operation, your rows have been aggregated into a single row (or one row per group, if you have a GROUP BY).
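
As a rough illustration of the difference, the sketch below applies both functions to a small hypothetical table. The `shapes` CTE, its column names, and the point coordinates are assumptions made up for this example.

```sql
-- Hypothetical inline data: three rows in two groups, two geography columns.
WITH shapes AS (
  SELECT 'a' AS grp, ST_GEOGPOINT(0, 0) AS g1, ST_GEOGPOINT(1, 1) AS g2 UNION ALL
  SELECT 'a', ST_GEOGPOINT(2, 2), ST_GEOGPOINT(3, 3) UNION ALL
  SELECT 'b', ST_GEOGPOINT(4, 4), ST_GEOGPOINT(5, 5)
)
SELECT
  grp,
  -- ST_UNION(g1, g2) merges the two columns within each row (horizontal);
  -- ST_UNION_AGG then collapses the rows of each group (vertical).
  ST_UNION_AGG(ST_UNION(g1, g2)) AS group_union
FROM shapes
GROUP BY grp;
```

Selecting only `ST_UNION(g1, g2)` without the aggregate and the GROUP BY would keep all three rows; the aggregated form returns one row per group.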

Merging tables in Google BigQuery with UNION ALL

The columns have to match in number, order, and type across all the queries, so pad the missing ones with NULL, something like this:

SELECT YYYYMMDDHH, CONTAINER, Parent_Container, PROTOTYPE_ID,
Withdrawal_this_hour, NULL as Refill_this_hour,
NULL as changes_this_hour, NULL as net_amount, NULL as date
FROM `tb1`
UNION ALL
SELECT YYYYMMDDHH, CONTAINER, Parent_Container, PROTOTYPE_ID,
NULL, Refill_this_hour, NULL, NULL, NULL
FROM `tb2`
UNION ALL
SELECT YYYYMMDDHH, CONTAINER, Parent_Container, PROTOTYPE_ID,
NULL, NULL, changes_this_hour, net_amount, date
FROM `tb3`
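
One detail worth keeping in mind with this pattern: UNION ALL matches columns by position, not by name, and the output column names come from the first SELECT. A minimal sketch with hypothetical inline values (the aliases are made up for illustration):

```sql
-- Output columns are named after the first SELECT: withdrawal, refill.
SELECT 1 AS withdrawal, CAST(NULL AS INT64) AS refill
UNION ALL
-- The aliases below are ignored. Values are matched by position, so 5 lands
-- in the second output column (refill) even though it is aliased withdrawal.
SELECT CAST(NULL AS INT64) AS refill, 5 AS withdrawal;
```

This is why each SELECT in the query above lists its NULL placeholders in the same order as the first one.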

How to union two tables that start with temporary table logic in BigQuery SQL

I would think this would be close. I would need more detail on the nature of your problem, or a way to reproduce it, to isolate the issue.

WITH origin_table AS (
  SELECT
    date,
    (
      SELECT value
      FROM UNNEST(hits.customDimensions)
      WHERE INDEX = 10
    ) AS second_scroll,
    (
      SELECT value
      FROM UNNEST(hits.customDimensions)
      WHERE INDEX = 11
    ) AS dwell
  FROM (
    SELECT
      date,
      hits,
    FROM
      `table_1`,
      UNNEST(hits) AS hits
  )
  GROUP BY 1
), -- added the comma for the 2nd CTE

origin_table2 AS ( -- begin 2nd CTE
  SELECT
    date,
    (
      SELECT value
      FROM UNNEST(hits.customDimensions)
      WHERE INDEX = 10
    ) AS second_scroll,
    (
      SELECT value
      FROM UNNEST(hits.customDimensions)
      WHERE INDEX = 11
    ) AS dwell
  FROM (
    SELECT
      date,
      hits,
    FROM
      `table_2`, -- changed to table 2
      UNNEST(hits) AS hits
  )
  GROUP BY 1
)

SELECT
  date,
  CASE
    WHEN second_scroll IS NOT NULL AND dwell IS NOT NULL THEN 1
    WHEN second_scroll IS NULL AND dwell IS NOT NULL THEN 0
    WHEN second_scroll IS NOT NULL AND dwell IS NULL THEN 0
  END AS ENGAGEMENT
FROM origin_table
UNION ALL -- here's the union
SELECT
  date,
  CASE
    WHEN second_scroll IS NOT NULL AND dwell IS NOT NULL THEN 1
    WHEN second_scroll IS NULL AND dwell IS NOT NULL THEN 0
    WHEN second_scroll IS NOT NULL AND dwell IS NULL THEN 0
  END AS ENGAGEMENT
FROM origin_table2 -- and selecting from the 2nd CTE to union
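
If repeating the CASE expression is a concern, one possible variation is to union the two CTEs in a third CTE and apply the logic once. The sketch below only shows that shape: the inline rows are hypothetical placeholders for the real `origin_table` and `origin_table2` definitions, and the CTE name `combined` is made up for this example.

```sql
-- Hypothetical one-row stand-ins for the two CTEs defined above.
WITH origin_table AS (
  SELECT DATE '2024-01-01' AS date, 'yes' AS second_scroll, '30' AS dwell
),
origin_table2 AS (
  SELECT DATE '2024-01-02' AS date, CAST(NULL AS STRING) AS second_scroll, '12' AS dwell
),
combined AS (
  -- Union first, so the CASE logic is written only once below.
  SELECT * FROM origin_table
  UNION ALL
  SELECT * FROM origin_table2
)
SELECT
  date,
  CASE
    WHEN second_scroll IS NOT NULL AND dwell IS NOT NULL THEN 1
    WHEN second_scroll IS NULL AND dwell IS NOT NULL THEN 0
    WHEN second_scroll IS NOT NULL AND dwell IS NULL THEN 0
  END AS ENGAGEMENT
FROM combined;
```

This assumes both CTEs expose the same columns in the same order, which is the case when they are built from the same SELECT list.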

Outer Union equivalent - GBQ

It looks like you are looking for the following:

select a, b, d 
from `project.dataset.table1`
full outer join `project.dataset.table2`
using(b)

If applied to the sample data in your question, the output is:

[Image: query output]

You can avoid specifying all columns, as in the example below (but you will not control the order of columns in this case):

select * 
from `project.dataset.table1`
full outer join `project.dataset.table2`
using(b)

If you need to preserve the order, see below:

select t1.*, t2.* except(b)
from `project.dataset.table1` t1
full outer join `project.dataset.table2` t2
using(b)

If you really need a union, you can use the following:

select a, b, null as d from `project.dataset.table1`
union all
select null as a, b, d from `project.dataset.table2`

SQL Self Union Naming?


"Unfortunately I'm restricted to legacy SQL for a couple of other operations."

You can create a view named query1, which you will then reference as [project:dataset.query1]. Make sure you create it in legacy mode so that you can use it from a query in legacy mode.

Your query will then be almost exactly what you asked for ("I'm trying to do something like the following (with regards to the references to A)"):

SELECT * FROM 
[project:dataset.query1],
(SELECT
'Category' AS Hier_Level,
MAX(Department) Department,
Category,
'Various' AS Subcategory,
SUM(Dollars) AS Dollars
FROM [project:dataset.query1]
GROUP BY Category),
(SELECT
'Department' AS Hier_Level,
Department,
'Various' AS Category,
'Various' AS Subcategory,
SUM(Dollars) AS Dollars
FROM [project:dataset.query1]
GROUP BY Department)

With your example data, the result is as expected:

Row  Hier_Level   Department   Category  Subcategory  Dollars
1    Subcategory  Electronics  TV        LCD          3500
2    Subcategory  Electronics  TV        OLED         6000
3    Subcategory  Electronics  Phone     iPhone       600
4    Category     Electronics  TV        Various      9500
5    Category     Electronics  Phone     Various      600
6    Department   Electronics  Various   Various      10100
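
Should the legacy SQL restriction ever be lifted, the same self-union can be sketched in standard SQL with a CTE in place of the view. The inline rows below are taken from the Subcategory lines of the result above; treating them as the contents of `query1` is an assumption made for this example.

```sql
-- query1 is modeled as a CTE holding the three Subcategory rows.
WITH query1 AS (
  SELECT 'Subcategory' AS Hier_Level, 'Electronics' AS Department,
         'TV' AS Category, 'LCD' AS Subcategory, 3500 AS Dollars UNION ALL
  SELECT 'Subcategory', 'Electronics', 'TV', 'OLED', 6000 UNION ALL
  SELECT 'Subcategory', 'Electronics', 'Phone', 'iPhone', 600
)
SELECT * FROM query1
UNION ALL
-- Category-level totals.
SELECT 'Category', MAX(Department), Category, 'Various', SUM(Dollars)
FROM query1
GROUP BY Category
UNION ALL
-- Department-level totals.
SELECT 'Department', Department, 'Various', 'Various', SUM(Dollars)
FROM query1
GROUP BY Department;
```

The CTE can be referenced several times within the one statement, which is what the named view provides in the legacy SQL version. Row order is not guaranteed without an ORDER BY.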

