Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Detect schema change in BigQuery Tables

Hello Googlers,

I'm looking for a metadata field that shows when BigQuery table schema was last changed.

I've tried INFORMATION_SCHEMA.TABLES and __TABLES__ but they don't give me what I'm looking for.  INFORMATION_SCHEMA.TABLES has a creation_time field that only gets set when you create the table...not updated on ALTER.  __TABLES__ has a last_modified_time field which gets set/updated any time you create or alter a table but it also gets updated whenever you insert data into the table.

I haven't found any other promising fields in documentation or this forum.  Does anyone know if what I'm looking for can be found?

Here's a sample script showing what I've tried:

 

 

DROP TABLE IF EXISTS your_project.your_dataset.your_table;

SELECT CURRENT_TIMESTAMP();
--2024-02-28 19:48:32.661



SELECT table_id, TIMESTAMP_MILLIS(last_modified_time) as fmt_last_modified_time, last_modified_time
from your_project.your_dataset.__TABLES__
where table_id = 'your_table' ;
-- No data

CREATE TABLE your_project.your_dataset.your_table (LastName STRING(100), FirstName STRING(100));

SELECT table_id, TIMESTAMP_MILLIS(last_modified_time) as fmt_last_modified_time, last_modified_time
from your_project.your_dataset.__TABLES__
where table_id = 'your_table' ;
-- your_table 2024-02-28 19:48:54.487 1709174934487

SELECT table_name, creation_time
FROM your_project.your_dataset.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE' AND table_name = 'test_metadata';
-- your_table 2024-02-28 19:48:54.487

ALTER TABLE your_project.your_dataset.your_table ADD COLUMN MiddleName STRING(100);

SELECT table_id, TIMESTAMP_MILLIS(last_modified_time) as fmt_last_modified_time, last_modified_time
from your_project.your_dataset.__TABLES__
where table_id = 'your_table' ;
-- your_table 2024-02-28 19:49:54.296 1709174994296
-- changed, GOOD!

SELECT table_name, creation_time
FROM your_project.your_dataset.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE' AND table_name = 'your_table';
-- your_table 2024-02-28 19:48:54.487
-- no change, BAD!

INSERT INTO your_project.your_dataset.your_table (LastName, FirstName, MiddleName) VALUES ('Krigbaum', 'Michael', 'James');

select * from your_project.your_dataset.your_table;
--Krigbaum Michael James

SELECT table_id, TIMESTAMP_MILLIS(last_modified_time) as fmt_last_modified_time, last_modified_time
from your_project.your_dataset.__TABLES__
where table_id = 'your_table' ;
-- your_table 2024-02-28 19:51:48.009 1709175108009
-- changed, BAD!

 

0 2 1,863
2 REPLIES 2

It may be that the metadata you are looking for is simply not stored/recorded in the INFORMATION_SCHEMA tables.  INFORMATION_SCHEMA is primarily used to provide real-time interrogation of the current state of your BigQuery environment rather than providing historic state of "what was present in the past".  The later concept is likely what we would consider an "audit log".  BigQuery, like most of the Google Cloud services, also records a rich audit trail when its state changes.  What I'd suggest you do is examine the BigQuery audit logs that I believe you will find in Cloud Logging.  Make some changes to your tables and see if corresponding audit log records are written to Cloud Logging.  If yes, then you can also direct those audit records to be appended to a BigQuery table such that you can query the table of audit logs and, effectively, get the information you are looking for as part of a query.

@kolban Thank you for the great information in your response.  I've attempted to build a Cloud Logging query to capture table create/alter/drop events but it doesn't capture known instances.

protoPayload.serviceName="bigquery.googleapis.com"
resource.type="bigquery_dataset"
protoPayload.resourceName="projects/prj-dev-ebi-healthera/datasets/healthmart/tables/test_metadata"
protoPayload.metadata.@type="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
--protoPayload.metadata.tableDeletion:* or protoPayload.metadata.tableChange:* or protoPayload.metadata.tableCreation:*

With line 5 commented, the above query returns 3 events that should qualify for the conditions in line 5 but when line 5 is uncommented, the query doesn't return any results.  I'm sure that I just don't have the syntax correct...could you (or somebody else in this community) help me out