How to Update/Delete Rows in Bigquery from the Python API

Run delete query from python in BigQuery

I think below code snippet should work for you. You should pass query_delete instead of query

from google.cloud import bigquery

bigquery_client = bigquery.Client(project='my-project')

query_delete = f"""
delete from `my_table` where created_at >= '2021-01-01'
"""

print(query_delete)
job = bigquery_client.query(query_delete)
job.result()
print("Deleted!")

Or you can try below formatted query

query_delete = (
"DELETE from my_table "
"WHERE created_at >= '2021-01-01'"
)

Google BigQuery Delete Rows?

2016 update: BigQuery can delete and update rows now -- Fh

https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax


Thanks for describing your use case. BigQuery is append-only by design. We currently don't support deleting single rows or a batch of rows from an existing dataset.

Currently, to implement a "rotating" log system you must either:
1. Create a new table each day (and delete older tables if that is necessary)
2. Append your data to a table and query by time/date

I would actually recommend creating a new table for each day. Since BigQuery charges by amount of data queried over, this would be most economical for you, rather than having to query over entire massive datasets every time.

By the way - how are you currently collecting your data?

Delete bulk rows in Big Query table based on list of unique ids

As @blackbishop mention, you can try to upgrade requests to the latest version (in my case it solve the problem) but since I tried to update bulk rows (let's say 500.000+ row in Big Query, which every row have a unique id), turns out it gives me a timeout with small machine type Dataproc clusters that I used (if someone has a resource to try with better Dataproc cluster and success please feel free to edit this answer).

So I go with the Merge statement as documented here:
https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#merge_statement

The script would be like this to update existing rows (assuming I have the new data I retrieve and already load it to Staging_Sample_Dataset.Staging_Sample_Table):

def merge_data():
client = bigquery.Client()
query = """MERGE INTO Sample_Dataset.Sample_Table st
USING Staging_Sample_Dataset.Staging_Sample_Table ss
ON st.Sample_Table_Id = ss.Sample_Table_Id
WHEN MATCHED UPDATE SET st.Your_Column = ss.Your_Column -- and the list goes on...
WHEN NOT MATCHED THEN INSERT ROW
"""
query_job = client.query(query, location="asia-southeast2")
results = query_job.result()

or I could delete bulk rows and call another function to load after this function executed:

def bulk_delete():
client = bigquery.Client()
query = """MERGE INTO Sample_Dataset.Sample_Table st
USING Staging_Sample_Dataset.Staging_Sample_Table sst
ON st.Sample_Table_Id = sst.Sample_Table_Id
WHEN MATCHED THEN DELETE
"""
query_job = client.query(query, location="asia-southeast2")
results = query_job.result()

BigQuery update how to get number of updated rows

I think that you are searching for QueryJob.num_dml_affected_rows. It contain number of rows affected by update or any other DML statement. If you just paste it to your code instead of rows in return statement you will get number as int or you can create some massage like :

return("Number of updated rows: " + str(job_query.num_dml_affected_rows))

I hope it will help :)

How to delete rows in a bigquery table using golang

If you want to delete rows, you issue a DML DELETE. More details about the DELETE statement can be found here: https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#delete_statement

If you want an example of issuing a query, you can see the examples embedded elsewhere in the BigQuery documentation, or just go directly to source code snippets e.g. https://github.com/GoogleCloudPlatform/golang-samples/blob/master/bigquery/snippets/querying/bigquery_query.go

If that's not sufficient, please clarify your question to provide more details about the issue you're experiencing (for example, the schema of your table and what you're trying to delete).

BigQuery: Questions on Delete and Update rows using nodejs

Building on the sync_query example:

async function runDeleteQuery(projectId) {
// Imports the Google Cloud client library
const BigQuery = require('@google-cloud/bigquery');

/**
* TODO(developer): Uncomment the following line before running the sample.
*/
// const projectId = "your-project-id";

// Modify this query however you need
const sqlQuery = "DELETE FROM dataset.table WHERE condition;";

// Creates a client
const bigquery = new BigQuery({projectId});

// Query options list: https://cloud.google.com/bigquery/docs/reference/v2/jobs/query
const options = {
query: sqlQuery,
timeoutMs: 100000, // Time out after 100 seconds.
useLegacySql: false, // Use standard SQL syntax for queries.
};

// Runs the query
await bigquery.query(options);
}

See also the DELETE statement documentation.

I am using google bigquery and i want to delete rows having timestamp after particuler time period

Usually it takes no more than 90 minutes for the data to become available for copy/delete operations. Just wait a bit.



Related Topics



Leave a reply



Submit