BigQuery If field exists
Let's assume your table has only the fields x and y.
The query below will then work fine:
SELECT x, y FROM YourTable
But the one below will fail because field z does not exist:
SELECT x, y, z FROM YourTable
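One way to address this is shown below. In legacy SQL a comma between tables means UNION ALL, so the helper subquery adds z to the combined schema, and WHERE fake IS NULL then filters out the helper's own row: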
#legacySQL
SELECT x, y, COALESCE(z, 0) as z
FROM
(SELECT * FROM YourTable),
(SELECT true AS fake, NULL as z)
WHERE fake IS NULL
EDIT: added an explicit #legacySQL prefix so as not to confuse anyone trying to apply this exact approach to Standard SQL :o)
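To see why the legacy SQL trick works, here is a minimal plain-Python sketch of the same idea. It is only a simulation and does not talk to BigQuery: union_all and coalesce are hypothetical helpers, and the sample rows are made up for illustration.

```python
def union_all(*tables):
    """Union lists of row dicts; a column missing from a row becomes None (NULL)."""
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return [{name: row.get(name) for name in columns}
            for table in tables for row in table]


def coalesce(*values):
    """Return the first value that is not None, like SQL COALESCE."""
    return next((v for v in values if v is not None), None)


your_table = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]  # made-up data, no z column
helper = [{"fake": True, "z": None}]               # the one-row helper subquery

unioned = union_all(your_table, helper)            # z now exists in every row
result = [
    {"x": r["x"], "y": r["y"], "z": coalesce(r["z"], 0)}
    for r in unioned
    if r["fake"] is None                           # drops the helper row itself
]
print(result)
```

If your_table did have a z column, its values would pass through coalesce unchanged, which is the point of the pattern.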
IF Field Exists in StandardSQL
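The query below is for BigQuery Standard SQL. It serializes a row to JSON with TO_JSON_STRING and uses a regular expression to check whether the key "peaches" is present; the outer query returns rows only if it is.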
#standardSQL
SELECT * FROM `project.dataset.fruits`
WHERE EXISTS (
SELECT 1 FROM `project.dataset.fruits` t
WHERE REGEXP_CONTAINS(TO_JSON_STRING(t), '[{,]"peaches":')
LIMIT 1
)
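The same check can be sketched in plain Python, assuming each row is serialized to compact JSON the way TO_JSON_STRING does by default. The function name row_has_field and the sample rows are hypothetical.

```python
import json
import re


def row_has_field(row, field):
    """Check whether `field` appears as a key in the row's compact JSON form."""
    # Compact separators mimic the {"a":1,"b":null} shape the regex expects.
    row_json = json.dumps(row, separators=(",", ":"))
    # A key is preceded by "{" or "," and followed by ":".
    pattern = r'[{,]"' + re.escape(field) + r'":'
    return re.search(pattern, row_json) is not None


print(row_has_field({"apples": 1, "peaches": None}, "peaches"))
print(row_has_field({"apples": 1, "pears": 2}, "peaches"))
```

Note that the pattern also matches a key of the same name inside a nested record, so this is a name check on the serialized row rather than a strict top-level schema check.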
BigQuery IF field exists THEN
The query below should point you in the right direction (it is legacy SQL, where the comma between subqueries acts as UNION ALL):
SELECT * FROM
(SELECT * FROM <somewhere w/o my_field>),
(SELECT * FROM <somewhere with my_field>)
Assuming a, b and c are the fields in your original table, the above can be used as follows if you need to change missing values from NULL to 0:
SELECT a, b, c, COALESCE(my_field, 0) as my_field
FROM
(SELECT * FROM <somewhere w/o my_field>),
(SELECT * FROM <somewhere with my_field>)
Select column value if column exists in that table else create that column and set its value to null in BigQuery
I assume in the following that you have a source table (the one with potentially "missing" columns) and an existing target table (with the desired schema).
To get the column information for these tables, query the INFORMATION_SCHEMA.COLUMNS view of each dataset.
The solution below uses dynamic SQL to 1) generate the desired SQL and 2) run it.
DECLARE column_selection STRING;

-- For every target column, look up the source column of the same name (NULL if missing).
SET column_selection = (
  WITH column_table AS (
    SELECT
      source.column_name AS source_column,
      tgt.column_name AS target_column
    FROM
      (SELECT column_name
       FROM `<yourproject>.<target_dataset>.INFORMATION_SCHEMA.COLUMNS`
       WHERE table_name = '<target_table>') tgt
    LEFT JOIN
      (SELECT column_name
       FROM `<yourproject>.<source_dataset>.INFORMATION_SCHEMA.COLUMNS`
       WHERE table_name = '<source_table>') source
    ON source.column_name = tgt.column_name
  )
  -- Keep existing columns as they are; emit NULL AS `name` for the missing ones.
  SELECT STRING_AGG(COALESCE(source_column,
    CONCAT("NULL AS `", target_column, "`")), ", \n") AS col_selection
  FROM column_table
);

-- Plug the generated column list into the final query and run it.
EXECUTE IMMEDIATE
  FORMAT("SELECT %s FROM `<yourproject>.<source_dataset>.<source_table>`", column_selection);
Explanation of the steps
- Build a column_table for the columns we want to query:
a. the first column contains the columns of the target table;
b. the second contains the corresponding source columns if they exist, or NULL if they don't.
- Once we have this table, we can build the desired SELECT list: each entry is the column name if the column is in the source table, or, if it is NOT present, the text " NULL AS `column_name_in_target` ". This is expressed as coalesce(source_column, CONCAT("NULL AS `", target_column, "`")).
- We aggregate all these entries with STRING_AGG into the desired column selection.
- Final step: put together the rest of the query ("SELECT" + <column_selection_string> + "FROM <your_source_table>" + ...) and run it with EXECUTE IMMEDIATE.
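The string-building part of the steps above can be sketched in plain Python. This is only an illustration of the logic, not a BigQuery client call: build_column_selection, build_query and the table name my_project.my_dataset.my_source are hypothetical, and the column lists stand in for what INFORMATION_SCHEMA.COLUMNS would return.

```python
def build_column_selection(target_columns, source_columns):
    """Mirror COALESCE(source_column, CONCAT("NULL AS `", target_column, "`"))."""
    available = set(source_columns)
    parts = []
    for column in target_columns:
        if column in available:
            parts.append(column)                       # column exists in the source
        else:
            parts.append("NULL AS `" + column + "`")   # missing: select NULL under that name
    return ", \n".join(parts)                          # same separator as STRING_AGG above


def build_query(source_table, target_columns, source_columns):
    """Assemble the statement that EXECUTE IMMEDIATE would run."""
    selection = build_column_selection(target_columns, source_columns)
    return "SELECT {} FROM `{}`".format(selection, source_table)


# Made-up schemas: the target expects a, b, c; the source only has a and c.
print(build_query("my_project.my_dataset.my_source", ["a", "b", "c"], ["a", "c"]))
```

Printing the generated statement before executing it is a cheap way to check the column list when debugging dynamic SQL.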
Filtering with exists in BigQuery
Use the query below instead. It compares the (a, b) pair as a struct against an array of structs with IN UNNEST:
SELECT * FROM UNNEST([
STRUCT(NULL AS a, '' AS b),
(1, 'Alpha'),
(2, 'Bravo'),
(3, 'Charlie'),
(4, 'Delta')
])
WHERE (a,b) in UNNEST([
STRUCT(NULL AS a, '' AS b),
(1, 'Alpha')
])
How to check if a value exists in an array type column using SQL?
Consider the approach below:
select format('%T', some_numbers) some_numbers,
(select count(1) > 0
from t.some_numbers number
where number in (3, 10)
) as exist
from sequences t
When applied to the sample data in your question, the query returns one row per row of sequences, with a boolean exist column.
Note: format('%T', some_numbers) is used only to format the array in the output; you can use just some_numbers instead.
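For comparison, here is a minimal Python sketch of the same membership test. The helper contains_any and the sample arrays are hypothetical and are not the data from the original question.

```python
def contains_any(values, wanted):
    """True if at least one array element is in `wanted`, like count(1) > 0 above."""
    # A missing (None) array is treated like an empty one: no elements, so False.
    return any(value in wanted for value in (values or []))


# Made-up rows standing in for the sequences table.
sequences = [
    {"some_numbers": [0, 1, 1, 2, 3, 5]},
    {"some_numbers": [2, 4, 8, 16, 32]},
    {"some_numbers": []},
]

for row in sequences:
    print(row["some_numbers"], contains_any(row["some_numbers"], {3, 10}))
```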