Hello!
I'm currently working on setting up a sink for BigQuery Audit Logs, and I'm wondering if it's possible to configure a single sink to collect audit logs from either entire organizations or multiple projects into a single BigQuery table. Is there a way to achieve this, or do I need to create separate sinks for each project individually? I appreciate your assistance in addressing my query!
Hi @ylee1212 ,
Aggregated Log Sink Definition: An aggregated log sink is a sink that collects logs from multiple resources, such as projects, folders, or an entire organization. It is created at a higher level (organization or folder) with the --include-children flag (if using gcloud).
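For reference, a minimal gcloud sketch of the same setup (the sink name, destination project, dataset, and organization ID below are placeholders, not values from this thread):

```
# Create an organization-level aggregated sink that routes BigQuery audit logs
# from all child projects into a single BigQuery dataset.
gcloud logging sinks create bq-audit-sink \
  bigquery.googleapis.com/projects/MY_LOGGING_PROJECT/datasets/bq_audit_logs \
  --organization=ORGANIZATION_ID \
  --include-children \
  --log-filter='resource.type="bigquery_resource" AND protoPayload.serviceName="bigquery.googleapis.com"'
```

The same command works at the folder level if you pass --folder=FOLDER_ID instead of --organization.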
Console Steps: To create an aggregated log sink to collect BigQuery audit logs from multiple projects or an entire organization in the Google Cloud Console, follow these steps:
1. Go to the Google Cloud console: https://console.cloud.google.com/.
2. Click the hamburger menu (three horizontal lines) in the top-left corner of the page.
3. Select Logging.
4. Click the Sinks tab.
5. Click the Create sink button.
6. In the Create sink dialog box, enter the following information: under Filter, select Create a new filter.
7. In the Filter expression field, enter the following filter expression to collect all BigQuery audit logs:

resource.type="bigquery_resource" AND protoPayload.serviceName="bigquery.googleapis.com"
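If you'd like to sanity-check the filter before creating the sink, a quick read at the organization level should return matching entries whenever there has been recent BigQuery activity (the organization ID is a placeholder):

```
# Preview a few recent log entries that match the sink filter.
gcloud logging read \
  'resource.type="bigquery_resource" AND protoPayload.serviceName="bigquery.googleapis.com"' \
  --organization=ORGANIZATION_ID \
  --limit=5 \
  --freshness=1d
```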
Service Account Permissions: After you have created the sink, you will need to grant the sink's service account permission to write to the destination BigQuery dataset. To do this, grant the sink's writer identity the BigQuery Data Editor role (roles/bigquery.dataEditor) on the destination dataset.
Once you have granted the sink service account permission to write to BigQuery, it will start collecting audit logs. You can then query the BigQuery tables to view your audit logs.
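As a rough gcloud sketch of those steps (sink name, organization ID, and destination project are placeholders): look up the sink's writer identity, then grant it the role on the project that hosts the destination dataset.

```
# The writer identity is the service account Cloud Logging uses for this sink;
# the returned value already includes the "serviceAccount:" prefix.
WRITER_IDENTITY=$(gcloud logging sinks describe bq-audit-sink \
  --organization=ORGANIZATION_ID \
  --format='value(writerIdentity)')

# Grant it BigQuery Data Editor on the project containing the destination dataset.
gcloud projects add-iam-policy-binding MY_LOGGING_PROJECT \
  --member="$WRITER_IDENTITY" \
  --role='roles/bigquery.dataEditor'
```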
Different Sink Output Schema Between Project and Organization Level Audit Logs
Firstly, I'd like to extend my gratitude to @ms4446 for the comprehensive response provided earlier.
To give some context, I recently set up a sink using Terraform following the guidance you've shared. During the implementation, I consulted a Google blog post and noticed some disparities in the output schema of our sinks.
For the project level sink, I can query the field protopayload_auditlog.metadataJson. However, this particular field seems to be absent from our organization level schema. Here are the snapshots for reference:
Organization BigQuery Audit Log Sink Snapshot:
Project Level Sink Snapshot:
Given the absence of the metadataJson field, I'm finding it challenging to execute the following query from the aforementioned blog:
SELECT
  timestamp AS time_of_access,
  protopayload_auditlog.authenticationInfo.principalEmail AS user_email,
  protopayload_auditlog.requestMetadata.callerIp AS ip,
  auth.permission AS auth_permission,
  auth.granted AS auth_granted,
  data_access.resource.labels.project_id AS job_execution_project,
  SPLIT(protopayload_auditlog.resourceName, '/')[SAFE_OFFSET(1)] AS referenced_project,
  SPLIT(protopayload_auditlog.resourceName, '/')[SAFE_OFFSET(3)] AS referenced_dataset,
  SPLIT(protopayload_auditlog.resourceName, '/')[SAFE_OFFSET(5)] AS referenced_table,
  ARRAY_LENGTH(SPLIT(JSON_EXTRACT(JSON_EXTRACT(protopayload_auditlog.metadataJson, '$.tableDataRead'), '$.fields'), ',')) AS num_fields,
  SPLIT(JSON_EXTRACT(JSON_EXTRACT(protopayload_auditlog.metadataJson, '$.tableDataRead'), '$.fields'), ",") AS fields
FROM `my-project`.my_dataset.cloudaudit_googleapis_com_data_access AS data_access,
  UNNEST(protopayload_auditlog.authorizationInfo) AS auth
WHERE
  protopayload_auditlog.methodName = "google.cloud.bigquery.v2.JobService.InsertJob"
  AND data_access.resource.type = 'bigquery_dataset'
  AND JSON_EXTRACT(JSON_EXTRACT(protopayload_auditlog.metadataJson, '$.tableDataRead'), '$.reason') = '"JOB"';
I'm curious to understand if there's an inherent difference between the project level and organization level audit logs. Why might we be witnessing these inconsistencies?
Hi @ylee1212 ,
Thank you for the additional information. The presence or absence of the protopayload_auditlog.metadataJson field in audit logs should not inherently differ based on whether the logs are collected at the project or organization level. Both levels should capture the same fields for the same types of events.
If you're observing a discrepancy, it might be due to the specific types of logs being captured, the filters applied, or other configuration differences between the two sinks. The metadataJson field provides additional metadata for certain types of audit log entries, particularly data access logs.
To troubleshoot, ensure both sinks are capturing the same types of logs and have consistent configurations.
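One quick way to compare what each sink actually produced is to dump the schema of the data_access table behind each one and diff them; a sketch with the bq CLI (project and dataset names are placeholders):

```
# Export the schema of the data_access table created by each sink, then diff.
bq show --schema --format=prettyjson \
  org-logging-project:org_audit_logs.cloudaudit_googleapis_com_data_access > org_schema.json
bq show --schema --format=prettyjson \
  my-project:my_dataset.cloudaudit_googleapis_com_data_access > project_schema.json
diff org_schema.json project_schema.json
```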
Happy Friday @ms4446 !
It seems like the `resource.type="bigquery_resource"` filter is what's excluding the newer version of the log, BigQueryAuditMetadata. Thanks for your advice! I am removing this field and about to apply the filter:
protoPayload.serviceName="bigquery.googleapis.com"
AND protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
so we get only the newer version of the log.
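In case it helps anyone managing the sink with gcloud rather than Terraform, the equivalent update would be roughly the following (sink name and organization ID are placeholders):

```
# Narrow the aggregated sink to the newer BigQueryAuditMetadata-format entries only.
gcloud logging sinks update bq-audit-sink \
  --organization=ORGANIZATION_ID \
  --log-filter='protoPayload.serviceName="bigquery.googleapis.com" AND protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"'
```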
Yes, Happy Friday @ylee1212 😊
That's right. Using the resource.type="bigquery_resource" filter might exclude the newer version of the BigQuery audit log, which is represented by the bigqueryAuditMetadata field.
The filter you've provided:
protoPayload.serviceName="bigquery.googleapis.com"
AND protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
should capture only the newer BigQueryAuditMetadata-format entries.
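A quick way to confirm is to read a few entries with that filter at the organization level (the organization ID is a placeholder); every match should carry the BigQueryAuditMetadata payload:

```
# Verify that only new-format (BigQueryAuditMetadata) entries match the filter.
gcloud logging read \
  'protoPayload.serviceName="bigquery.googleapis.com" AND protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"' \
  --organization=ORGANIZATION_ID \
  --limit=3 \
  --freshness=1h
```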
Hello, I'm attempting to set this up but am not seeing the log tables actually created in my target dataset (I called it bigquery_audit_logs).
I have granted the following roles to the log agent service account: Logging Admin, BigQuery Data Editor, BigQuery User.
My inclusion filter is the same BigQueryAuditMetadata filter shown above.
Do I need to create the tables within the dataset manually, or something?
Here are potential causes and steps to investigate why your BigQuery audit log tables aren't being created:
Sink Configuration:
Sink Destination: Confirm that the sink points to the correct project and dataset (bigquery_audit_logs). This ensures tables are created in the intended location.
Parent Resource: Verify that the sink's "parent" is correctly set to the organization or folder intended for log collection. Incorrect settings here could affect log aggregation.
Sink Status: Check the sink's status in the Cloud Logging console to ensure it's active and error-free.
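A quick way to check the destination, filter, and writer identity from the command line (sink name and organization ID are placeholders):

```
# List org-level sinks, then inspect the one routing BigQuery audit logs.
gcloud logging sinks list --organization=ORGANIZATION_ID
gcloud logging sinks describe bq-audit-sink --organization=ORGANIZATION_ID
```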
Permissions:
Table Creation Permissions: The roles you've assigned should generally cover the necessary permissions. However, ensure the logging service account has the BigQuery Data Editor role (roles/bigquery.dataEditor) on the target dataset. This role implicitly allows for table creation and data insertion.
Log Availability:
Activity Monitoring: Make sure there's ongoing activity in the projects monitored by your sink. Without BigQuery actions, no audit logs (and consequently no tables) will be generated.
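A simple way to generate that activity is to run any query in one of the monitored projects; for example, against a public dataset:

```
# Run a query in a monitored project to generate BigQuery data-access audit log entries.
bq query --use_legacy_sql=false \
  'SELECT COUNT(*) AS row_count FROM `bigquery-public-data.samples.shakespeare`'
```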
Filters:
While your filter appears correctly configured to capture BigQuery audit metadata logs, consider broadening your filter temporarily to see if it affects log flow.
Troubleshooting Steps:
Cloud Logging Console: Inspect the Cloud Logging console for any errors related to your sink. Look for logs or warnings that might indicate issues with log delivery.
IAM Permissions: Navigate to the IAM page for your dataset (bigquery_audit_logs). Ensure the service account used by your logging sink has the BigQuery Data Editor role. This role is crucial for allowing the service account to create tables and insert data into BigQuery.
Perform some BigQuery operations within the projects your sink monitors. This can help generate audit logs and trigger table creation.
There might be a delay from when logs are generated to when they are visible in BigQuery due to processing times.
The sink should automatically manage table creation upon receiving the first relevant log entry. Manual creation is not necessary and could complicate the process.
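Once the first matching entries are delivered, the audit log tables should appear in the destination dataset on their own; a quick check, assuming the dataset name used in this thread (the project ID is a placeholder):

```
# The cloudaudit_googleapis_com_* tables appear automatically after the first delivery.
bq ls MY_PROJECT:bigquery_audit_logs
```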
Thank you @ms4446 ! This was a log availability problem 😅. I'm good to go now.
Hello, based on the above content, I'm trying to set up an audit log sink in a BigQuery dataset.
An issue persists, producing errors like the following: "Cloud Logging sink configuration error in orgid, sink all_auditlogs_sink_bq: dataset_org_policy_denied ()".
I have granted the service account used by the sink the BigQuery Data Editor role in the project where the BigQuery dataset is located. What could be the problem? Please check.