Trouble with joining tables on BigQuery
It looks like your stats table has multiple rows for the same ExternalCustomerId
(which is understandable if, for example, the table is partitioned and holds different data for each day).
Try exploring a bit further with a query like this:
SELECT count(*) as total, count(distinct ExternalCustomerId) as uniques
FROM `298114322003.google_ads1.p_AccountStats_2670156874`
If ExternalCustomerId is duplicated (total is larger than uniques), then every client row will be multiplied by the number of matching rows in the stats table.
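The multiplication effect can be sketched locally. This is only an illustration that uses Python's built-in sqlite3 module as a stand-in for BigQuery; the clients and stats tables, their extra columns, and all values are made up, and only the ExternalCustomerId column name comes from the question.

```python
import sqlite3

# In-memory database standing in for BigQuery (assumption: made-up sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (ExternalCustomerId INTEGER, name TEXT);
    CREATE TABLE stats   (ExternalCustomerId INTEGER, day TEXT, clicks INTEGER);
    INSERT INTO clients VALUES (1, 'a'), (2, 'b');
    INSERT INTO stats VALUES
        (1, '2021-01-01', 10),
        (1, '2021-01-02', 20),
        (1, '2021-01-03', 5),
        (2, '2021-01-01', 7);
""")

# Same diagnostic as in the answer: total rows vs. distinct ids.
total, uniques = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT ExternalCustomerId) FROM stats"
).fetchone()

# A plain join returns one row per matching stats row, not one per client.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM clients c JOIN stats s USING (ExternalCustomerId)"
).fetchone()[0]

# Aggregating stats to one row per id before joining avoids the fan-out.
aggregated_rows = conn.execute("""
    SELECT COUNT(*)
    FROM clients c
    JOIN (
        SELECT ExternalCustomerId, SUM(clicks) AS clicks
        FROM stats
        GROUP BY ExternalCustomerId
    ) s USING (ExternalCustomerId)
""").fetchone()[0]

print(total, uniques, joined_rows, aggregated_rows)
```

Whether summing is the right way to collapse the duplicates depends on what the stats columns mean; the point is only that the join input should have one row per ExternalCustomerId.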
How to filter multiple conditions in BigQuery?
The subquery you have created cannot be used in the WHERE clause.
One approach would be to use UNION ALL to combine the two sets of results.
This should work:
(SELECT
TIMESTAMP_DIFF(ended_at,started_at,SECOND) AS ride_length,
start_station_name,
end_station_name
FROM
divvy_stations_trips.all_trips
WHERE
TIMESTAMP_DIFF(ended_at,started_at,SECOND) > 31 AND
start_station_name != end_station_name)
UNION ALL
(SELECT
TIMESTAMP_DIFF(ended_at,started_at,SECOND) AS ride_length,
start_station_name,
end_station_name
FROM
divvy_stations_trips.all_trips
WHERE
TIMESTAMP_DIFF(ended_at,started_at,SECOND) > 60 AND
start_station_name = end_station_name)
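Because the two branches are mutually exclusive (one requires different stations, the other the same station), the same filter can also be written as a single WHERE with OR. The sketch below checks that equivalence locally with Python's sqlite3 module as a stand-in for BigQuery. It is an alternative formulation, not the answer's query: the sample rows are invented, and ride_length is stored as a plain column here instead of being computed with TIMESTAMP_DIFF.

```python
import sqlite3

# In-memory stand-in for the trips table (assumption: invented rows,
# ride_length precomputed in seconds).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_trips (
        ride_length INTEGER,
        start_station_name TEXT,
        end_station_name TEXT
    );
    INSERT INTO all_trips VALUES
        (20, 'A', 'B'),
        (45, 'A', 'B'),
        (45, 'A', 'A'),
        (90, 'A', 'A'),
        (90, 'B', 'C');
""")

# Same shape as the UNION ALL query in the answer.
union_sql = """
    SELECT ride_length, start_station_name, end_station_name
    FROM all_trips
    WHERE ride_length > 31 AND start_station_name != end_station_name
    UNION ALL
    SELECT ride_length, start_station_name, end_station_name
    FROM all_trips
    WHERE ride_length > 60 AND start_station_name = end_station_name
"""

# Single pass over the table with both conditions combined by OR.
or_sql = """
    SELECT ride_length, start_station_name, end_station_name
    FROM all_trips
    WHERE (ride_length > 31 AND start_station_name != end_station_name)
       OR (ride_length > 60 AND start_station_name = end_station_name)
"""

union_rows = sorted(conn.execute(union_sql).fetchall())
or_rows = sorted(conn.execute(or_sql).fetchall())
print(union_rows == or_rows, or_rows)
```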
Dynamic conditional joins in BigQuery
Below is for BigQuery Standard SQL and assumes the tables are set up as shown below (as agreed with the OP in the comments on the question)
identification_table
SELECT 'Id 1' id, 'masterProductId' subject, '=' operator, '1007' value UNION ALL
SELECT 'Id 1', 'brandName', '=', 'brand p' UNION ALL
SELECT 'Id 1', 'categoryName', '=', 'category 1' UNION ALL
SELECT 'Id 2', 'categoryName', '=', 'category 1' UNION ALL
SELECT 'Id 2', 'price', '<', '130' UNION ALL
SELECT 'Id 3', 'categoryName', '=', 'category 3'
filing_table
SELECT 11 code, 'category 1' categoryName, 'brand p' brandName, 1001 masterProductId UNION ALL
SELECT 22, 'category 1', 'brand z', 1002 UNION ALL
SELECT 33, 'category 2', 'brand c', 1003 UNION ALL
SELECT 44, 'category 2', 'brand v', 1004 UNION ALL
SELECT 55, 'category 3', 'brand e', 1005
price
SELECT 11 code, 3 price UNION ALL
SELECT 22, 100 UNION ALL
SELECT 33, 8 UNION ALL
SELECT 44, 9 UNION ALL
SELECT 77, 28
So, the query below extracts from filing_table those masterProductId values that satisfy all criteria of at least one id in identification_table
EXECUTE IMMEDIATE '''
SELECT masterProductId
FROM (
SELECT f.*, price
FROM `filing_table` f
LEFT JOIN `price` p
USING(code)
)
WHERE ''' || (
SELECT STRING_AGG('(' || condition || ')', ' OR ')
FROM (
SELECT STRING_AGG(FORMAT('(%s %s %s)', subject, operator, value), ' AND ') condition
FROM `identification_table`,
UNNEST([IF(subject IN ('price', 'masterProductId'), value, '"' || value || '"')]) value
GROUP BY id
));
Applied to the sample data at the top of the answer, the output is
Row masterProductId
1 1001
2 1002
3 1005
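To see what the dynamic part of the statement generates, the string building can be mirrored outside BigQuery. The Python sketch below is only an illustration of the STRING_AGG logic; the helper name build_predicate and the NUMERIC_SUBJECTS constant are hypothetical, and the criteria are copied from the sample identification_table.

```python
# Rows of the sample identification_table: (id, subject, operator, value).
criteria = [
    ("Id 1", "masterProductId", "=", "1007"),
    ("Id 1", "brandName", "=", "brand p"),
    ("Id 1", "categoryName", "=", "category 1"),
    ("Id 2", "categoryName", "=", "category 1"),
    ("Id 2", "price", "<", "130"),
    ("Id 3", "categoryName", "=", "category 3"),
]

# Subjects compared as numbers; every other value gets double quotes,
# as in the IF(subject IN (...)) expression of the query.
NUMERIC_SUBJECTS = {"price", "masterProductId"}


def build_predicate(rows):
    """Hypothetical helper: AND the conditions within an id, OR across ids."""
    per_id = {}
    for id_, subject, operator, value in rows:
        if subject not in NUMERIC_SUBJECTS:
            value = '"' + value + '"'
        per_id.setdefault(id_, []).append(f"({subject} {operator} {value})")
    return " OR ".join(
        "(" + " AND ".join(conditions) + ")" for conditions in per_id.values()
    )


predicate = build_predicate(criteria)
print(predicate)
```

In BigQuery the order of the aggregated conditions is not guaranteed without an ORDER BY inside STRING_AGG, but since the pieces are only combined with AND and OR, the order does not change the result.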
Joining multiple tables in bigquery
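I think you are looking for something like the query below (the square brackets around table names are legacy SQL syntax; in Standard SQL, use backticks instead)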
SELECT
t1.field1 AS field1,
t2.field2 AS field2,
t1.field3 AS field3,
t3.field4 AS field4
FROM [datasetName.tableA] t1
JOIN [datasetName.tableB] t2 ON t1.somefield = t2.anotherfield
JOIN [datasetName.tableC] t3 ON t1.somefield = t3.yetanotherfield
BigQuery referencing subquery under WITH clause in WHERE clause
Firstly a note.
You probably don't want to do this.
A CTE (the WITH query) is a bit counterintuitive for people who normally write general-purpose code, because it feels like a variable but it is not.
BigQuery does not store the result of a non-recursive CTE; the CTE may be re-executed for each place it is referenced, which can mean poorer performance and extra cost.
I recommend replacing this with a simple JOIN; it achieves the same thing and generally performs better.
Basically your query would be like:
WITH list_of_ids AS (
SELECT id FROM table_with_ids
)
SELECT main_table.*
FROM main_table
JOIN list_of_ids
ON main_table.id = list_of_ids.id
I think it is pretty clean syntax and solves your problem.
Let me know if there is something I am missing and I can add to this.
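For comparison, here is a small local check that the JOIN form returns the same rows as a WHERE ... IN (SELECT ...) filter on the CTE. It uses Python's sqlite3 module as a stand-in for BigQuery, with invented rows and a made-up payload column; the equivalence only holds when the ids in list_of_ids are unique, since a JOIN repeats a main_table row once per duplicate id.

```python
import sqlite3

# In-memory stand-in for the two tables (assumption: invented sample rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table (id INTEGER, payload TEXT);
    CREATE TABLE table_with_ids (id INTEGER);
    INSERT INTO main_table VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO table_with_ids VALUES (1), (3);
""")

# JOIN form, as in the answer.
join_sql = """
    WITH list_of_ids AS (
        SELECT id FROM table_with_ids
    )
    SELECT main_table.*
    FROM main_table
    JOIN list_of_ids
      ON main_table.id = list_of_ids.id
"""

# WHERE form: the CTE is read through a subquery inside IN.
in_sql = """
    WITH list_of_ids AS (
        SELECT id FROM table_with_ids
    )
    SELECT main_table.*
    FROM main_table
    WHERE main_table.id IN (SELECT id FROM list_of_ids)
"""

join_rows = sorted(conn.execute(join_sql).fetchall())
in_rows = sorted(conn.execute(in_sql).fetchall())
print(join_rows == in_rows, join_rows)
```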