BigQuery Date-Partitioned Views
Define your view to expose the partitioning pseudocolumn, like this:
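SELECT *, EXTRACT(DATE FROM _PARTITIONTIME) AS date
FROM `mydataset.date_partitioned_table`;
Here `mydataset.date_partitioned_table` is a placeholder for your own ingestion-time partitioned table. Now if you query the view using a filter on date, it will restrict the partitions that are read.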
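As a rough sketch of how the pieces fit together, assuming an ingestion-time partitioned table called `mydataset.events` and a view called `mydataset.events_by_date` (both names are hypothetical), the view can be created and then queried with a filter on the exposed column:

```sql
-- Hypothetical names: mydataset.events is assumed to be an
-- ingestion-time partitioned table; events_by_date is the view over it.
CREATE OR REPLACE VIEW `mydataset.events_by_date` AS
SELECT *, EXTRACT(DATE FROM _PARTITIONTIME) AS date
FROM `mydataset.events`;

-- Filtering on the exposed column is what allows partitions
-- outside the requested range to be skipped.
SELECT COUNT(*) AS row_count
FROM `mydataset.events_by_date`
WHERE `date` BETWEEN DATE '2021-05-01' AND DATE '2021-05-07';
```

One way to check that pruning is happening is a dry run (for example `bq query --dry_run`), which reports the bytes the query would process without running it.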
Do views of tables in BigQuery benefit from partitioning/clustering optimization?
If you're talking about a logical view, then yes: if the base table it references is clustered or partitioned, queries through the view use those features when the relevant columns are referenced in the WHERE clause. A logical view has no managed storage of its own; it is effectively a SQL subquery that runs whenever the view is referenced.
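To illustrate the logical-view case, here is a minimal sketch. The table `mydataset.orders`, the view `mydataset.orders_v`, and all column names are made up for the example; the base table is assumed to be partitioned on a DATE column and clustered by a customer ID:

```sql
-- Hypothetical base table: partitioned by order_date, clustered by customer_id.
CREATE TABLE `mydataset.orders` (
  order_id STRING,
  customer_id STRING,
  order_date DATE,
  amount NUMERIC
)
PARTITION BY order_date
CLUSTER BY customer_id;

-- The logical view stores nothing; it is expanded into the query at run time.
CREATE VIEW `mydataset.orders_v` AS
SELECT order_id, customer_id, order_date, amount
FROM `mydataset.orders`;

-- Filters on the partitioning and clustering columns are applied
-- against the base table when the view is expanded.
SELECT SUM(amount) AS total
FROM `mydataset.orders_v`
WHERE order_date = DATE '2021-05-10'
  AND customer_id = 'C123';
```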
If you're talking about a materialized view, then partitioning/clustering from the base table isn't inherited, but can be defined on the materialized view. See the DDL syntax for more details: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_materialized_view_statement
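For the materialized-view case, a sketch along the lines of the DDL documentation might look like the following. The table and view names are hypothetical, and the base table is assumed to be partitioned on the same column that the materialized view partitions on:

```sql
-- Assumed base table, partitioned on transaction_time.
CREATE TABLE `mydataset.transactions` (
  employee_id INT64,
  transaction_time TIMESTAMP
)
PARTITION BY DATE(transaction_time);

-- Partitioning and clustering are declared on the materialized view itself;
-- they are not picked up automatically from the base table.
CREATE MATERIALIZED VIEW `mydataset.transactions_mv`
PARTITION BY DATE(transaction_time)
CLUSTER BY employee_id
AS
SELECT employee_id, transaction_time, COUNT(employee_id) AS cnt
FROM `mydataset.transactions`
GROUP BY employee_id, transaction_time;
```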
Query a view which is created from partitioned table in bigquery
If you want the partition time in the view, you need to include it explicitly:
SELECT c.*, _PARTITIONTIME AS pt
FROM `customers` AS c
WHERE DATE(_PARTITIONTIME) > '2021-05-10'
Can I efficiently GROUP BY over a date partitioned table in BigQuery
Found the answer, this does the job:
SELECT table_name, partition_id, total_rows
FROM `p.d.INFORMATION_SCHEMA.PARTITIONS`
WHERE partition_id IS NOT NULL
AND table_name = 't'
ORDER BY partition_id DESC
It returns quickly and, of course, queries much less data.
Query complete (1.7 sec elapsed, 10 MB processed)
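If the goal is a coarser grouping, the same metadata can be aggregated further. The following is a sketch only, assuming table `t` uses daily partitions so that `partition_id` has the form YYYYMMDD; it rolls the per-partition row counts up to months:

```sql
-- Assumes daily partitioning (partition_id formatted as YYYYMMDD).
-- The special __NULL__ and __UNPARTITIONED__ partitions are excluded,
-- since they do not correspond to a date.
SELECT
  DATE_TRUNC(SAFE.PARSE_DATE('%Y%m%d', partition_id), MONTH) AS month,
  SUM(total_rows) AS total_rows
FROM `p.d.INFORMATION_SCHEMA.PARTITIONS`
WHERE table_name = 't'
  AND partition_id NOT IN ('__NULL__', '__UNPARTITIONED__')
GROUP BY month
ORDER BY month DESC;
```

This only works for counts that the partition metadata already tracks; grouping on anything else still needs a query against the table itself.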
UPDATE statement in BigQuery that sets _PARTITIONDATE equal to a particular date field in your table
Creating a partitioned table
Since you don't need a table partitioned by ingestion time, you can create your table using your own date field as the partition field. You can do so by adding a "PARTITION BY" clause when creating the table, like this:
CREATE TABLE `project_id.mydataset.mytable` (
field1 STRING,
dt TIMESTAMP
)
PARTITION BY DATE(dt)
or
CREATE TABLE `project_id.mydataset.mytable`
PARTITION BY DATE(dt)
AS (
SELECT * FROM `project_id.mydataset.othertable`
)
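Once the table is partitioned on dt, queries benefit only when they filter on that column. As a hedged sketch, the first form of the statement above can be extended with the `require_partition_filter` table option so that unfiltered queries are rejected, followed by a query that reads a single day's partition (the date values are arbitrary examples):

```sql
-- Same table as above, with an option that forces callers
-- to filter on the partitioning column.
CREATE TABLE `project_id.mydataset.mytable` (
  field1 STRING,
  dt TIMESTAMP
)
PARTITION BY DATE(dt)
OPTIONS (require_partition_filter = TRUE);

-- Reads only the 2021-05-10 partition.
SELECT field1
FROM `project_id.mydataset.mytable`
WHERE dt >= TIMESTAMP '2021-05-10 00:00:00+00'
  AND dt < TIMESTAMP '2021-05-11 00:00:00+00';
```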
Updating the _PARTITIONTIME
Addressing your original question: if you need to, you can also update the _PARTITIONTIME field. To set _PARTITIONTIME equal to your dt column for every row, you can do the following:
UPDATE
project_id.dataset.mytable
SET
_PARTITIONTIME = dt
WHERE
1=1
If dt has a different granularity than _PARTITIONTIME (for example, _PARTITIONTIME granularity is day and dt is hour), then you can use TIMESTAMP_TRUNC:
UPDATE
project_id.dataset.mytable
SET
_PARTITIONTIME = TIMESTAMP_TRUNC(dt, DAY)
WHERE
1=1