Get The Last Modified Date for All Bigquery Tables in a Bigquery Project

For a SQL approach, you could try this query:

#standardSQL
SELECT *, TIMESTAMP_MILLIS(last_modified_time)
FROM `dataset.__TABLES__`
WHERE table_id = 'table_id'
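The `last_modified_time` column in `__TABLES__` holds milliseconds since the Unix epoch, which is exactly what `TIMESTAMP_MILLIS` converts. The same conversion in plain Python, as a small sketch (the sample value is made up):

```python
from datetime import datetime, timezone

def millis_to_datetime(ms):
    """Convert a __TABLES__.last_modified_time value (epoch milliseconds) to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Hypothetical value as it might come back from __TABLES__
print(millis_to_datetime(1640995200000))  # 2022-01-01 00:00:00+00:00
```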

That said, I recommend seeing whether you can log these events at the application level. That way you can also understand why something didn't work as expected.

If you are already on GCP you can use Stackdriver (it works on AWS as well). We started using it in our projects and I recommend giving it a try (we tested it with Python applications, so I'm not sure how the tool performs with other clients, but it should be similar).

Get the last modified date of tables using the BigQuery tables GET API

Not quite sure which version of the API you are using, but I suspect the latest versions do not have the method dataset.list_tables().

Still, this is one way of getting the last modified field; see if it works for you (or gives you an idea of how to get this data):

from google.cloud import bigquery

# Authenticate with a service account key file
client = bigquery.Client.from_service_account_json('/key.json')

for dataset_item in client.list_datasets():
    dataset = client.get_dataset(dataset_item.reference)

    for table_item in client.list_tables(dataset):
        table = client.get_table(table_item.reference)
        print("Table {} last modified: {}".format(
            table.table_id, table.modified))

Google Big Query - How to get Last Updated Dates of all tables in a Dataset using SQL

Try the query below. If I understood the question correctly, it returns what you asked for, and more.


#standardSQL
SELECT
  table_id,
  DATE(TIMESTAMP_MILLIS(creation_time)) AS creation_date,
  DATE(TIMESTAMP_MILLIS(last_modified_time)) AS last_modified_date,
  row_count,
  size_bytes,
  CASE
    WHEN type = 1 THEN 'table'
    WHEN type = 2 THEN 'view'
    WHEN type = 3 THEN 'external'
    ELSE '?'
  END AS type,
  TIMESTAMP_MILLIS(creation_time) AS creation_time,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified_time,
  dataset_id,
  project_id
FROM `myproject.mydataset.__TABLES__`
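The numeric `type` codes above can also be decoded client-side if you read `__TABLES__` from code. A minimal Python sketch mirroring the CASE expression (the mapping itself comes straight from the query above):

```python
# Mirror of the CASE expression: __TABLES__.type codes to readable names
TABLE_TYPES = {1: "table", 2: "view", 3: "external"}

def type_name(code):
    """Return the readable name for a __TABLES__.type code, or '?' if unknown."""
    return TABLE_TYPES.get(code, "?")

print(type_name(1))  # table
print(type_name(2))  # view
print(type_name(9))  # ?
```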

BigQuery - query for metadata like 'last modified'

You can view the metadata by exporting BigQuery usage logs.

Set up logs export using Cloud Logging; see creating a sink for a more detailed walkthrough.

  1. Open Logging
  2. Click Logs Router
  3. Click Create Sink
  4. Enter "Sink Name"
  5. For "Sink service" choose "BigQuery dataset"
  6. Select your BigQuery dataset to monitor
  7. Create sink

Once the sink is created, all subsequent operations will store data-access logs in the table "cloudaudit_googleapis_com_data_access_YYYYMMDD" and activity logs in the table "cloudaudit_googleapis_com_activity_YYYYMMDD" under the BigQuery dataset you selected in the sink. Keep in mind that you can only track usage starting from the date you set up the logs export.


Using this simple query, I checked the latest update time of my table:

SELECT
  JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableChange.table.updateTime") AS lastUpdate
FROM `project_id.dataset.cloudaudit_googleapis_com_activity*`
WHERE protopayload_auditlog.metadataJson IS NOT NULL
ORDER BY lastUpdate DESC
LIMIT 1
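If you pull `metadataJson` into code instead, the standard `json` module does the same extraction as `JSON_EXTRACT`. A sketch over a made-up, trimmed-down payload (the real audit-log metadata has many more fields):

```python
import json

# Hypothetical, trimmed metadataJson payload from a cloudaudit activity row
metadata_json = '{"tableChange": {"table": {"updateTime": "2021-07-05T10:35:29.263Z"}}}'

def extract_update_time(raw):
    """Python equivalent of JSON_EXTRACT(..., '$.tableChange.table.updateTime')."""
    doc = json.loads(raw)
    return doc.get("tableChange", {}).get("table", {}).get("updateTime")

print(extract_update_time(metadata_json))  # 2021-07-05T10:35:29.263Z
```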


See more BigQuery logging query examples, and refer to the metadata JSON structure for an idea of what details you can pull about your tables.


EDIT 20210923:

Another option is to query __TABLES__, which returns:

  • project_id
  • dataset_id
  • table_id
  • creation_time
  • last_modified_time
  • row_count
  • size_bytes

See query:

SELECT * FROM `project_id.dataset_id.__TABLES__` LIMIT 100
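Since `size_bytes` comes back as a raw byte count, a small helper makes it readable when you process the rows in code (a sketch, not part of any BigQuery API):

```python
def human_size(num_bytes):
    """Format a __TABLES__.size_bytes value as a human-readable string."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} PB"

print(human_size(1536))        # 1.5 KB
print(human_size(10_000_000))  # 9.5 MB
```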


How do I get the last update time of a sequence of tables in BigQuery?

SELECT *
FROM project_name.data_set_name.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = 'my_table';

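INFORMATION_SCHEMA.PARTITIONS returns one row per partition, each with its own LAST_MODIFIED_TIME, so the table-level last update is the maximum over its partitions. A small Python sketch over hypothetical rows:

```python
from datetime import datetime

# Hypothetical (partition_id, last_modified_time) rows from INFORMATION_SCHEMA.PARTITIONS
partitions = [
    ("20220712", datetime(2022, 7, 12, 8, 0)),
    ("20220714", datetime(2022, 7, 14, 4, 39)),
    ("20220713", datetime(2022, 7, 13, 9, 30)),
]

# Table-level last update = max over its partitions
last_update = max(t for _, t in partitions)
print(last_update)  # 2022-07-14 04:39:00
```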

Executing a BigQuery query from Google Workflows to get the last modified time of a table: getting wrong results in the workflow, but the same query works fine in the BigQuery UI

The result in your workflow is correct. last_modified_time is returned as an integer timestamp (milliseconds since the Unix epoch); the BigQuery UI simply renders it in DateTime format. You can convert the last_modified_time value to a DateTime yourself.

I used https://www.epochconverter.com/ to convert your timestamp result to DateTime format, and here is the result: GMT: Monday, July 5, 2021 10:35:29.263.
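The same conversion works in code, assuming the workflow returned the value in milliseconds (1625481329263 is the value implied by the GMT result above):

```python
from datetime import datetime, timezone

# 1625481329263 ms corresponds to the GMT result quoted above (assumed value)
ts_millis = 1625481329263
dt = datetime.fromtimestamp(ts_millis / 1000, tz=timezone.utc)
print(dt)  # 2021-07-05 10:35:29.263000+00:00
```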

How to get the last transaction for a ticket based on timestamp in BigQuery

Try the query below:

with sample_data as (
  select struct(111 as ticket_id, ['Open','ReOpen','Modified','Cancelled'] as type, [timestamp('2022-07-14 03:39:00'),timestamp('2022-07-14 03:40:00'),timestamp('2022-07-14 03:50:00'),timestamp('2022-07-14 04:39:00')] as time_stamp) as ticket
  union all select struct(122 as ticket_id, ['Open','ReOpen','Modified'] as type, [timestamp('2022-07-14 07:39:00'),timestamp('2022-07-14 07:40:00'),timestamp('2022-07-14 07:50:00')] as time_stamp) as ticket
),

converted_timestamp as (
  select
    ticket.ticket_id,
    ticket.type,
    array(
      select format_timestamp("%m/%d/%y %H:%M", stamp)
      from unnest(ticket.time_stamp) as stamp
      order by stamp
    ) as formatted
  from sample_data
),
---------
-- disregard the CTEs above; they just generate the sample data
---------
processed_data as (
  select
    ticket_id,
    last_value(t) over (
      partition by ticket_id order by f
      rows between unbounded preceding and unbounded following
    ) as latest_type,
    max(f) over (partition by ticket_id) as latest_time
  from converted_timestamp,
    -- pair each type with the timestamp at the same array offset,
    -- instead of cross-joining the two arrays
    unnest(formatted) as f with offset as f_offset,
    unnest(type) as t with offset as t_offset
  where f_offset = t_offset
)

select distinct ticket_id, latest_type, latest_time from processed_data
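The key step is pairing each `type` with its `time_stamp` by array position rather than cross-joining the two arrays. In Python, the same "latest transaction per ticket" logic is a zip-and-max; a sketch using the sample data from the query above:

```python
from datetime import datetime

# Sample tickets from the query above: parallel type/time_stamp arrays
tickets = {
    111: (["Open", "ReOpen", "Modified", "Cancelled"],
          [datetime(2022, 7, 14, 3, 39), datetime(2022, 7, 14, 3, 40),
           datetime(2022, 7, 14, 3, 50), datetime(2022, 7, 14, 4, 39)]),
    122: (["Open", "ReOpen", "Modified"],
          [datetime(2022, 7, 14, 7, 39), datetime(2022, 7, 14, 7, 40),
           datetime(2022, 7, 14, 7, 50)]),
}

for ticket_id, (types, times) in tickets.items():
    # Pair each type with its timestamp by position, then take the latest by time
    latest_time, latest_type = max(zip(times, types))
    print(ticket_id, latest_type, latest_time)
```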



