BigQuery Storage API Billing Breakdown

Hello! Is there any way to break down BigQuery Storage API costs further than SKU.description? We've got quite a large amount of 'BigQuery Storage API Network Internet Egress Europe to Europe', but we want to know which tables are involved here, which users are calling the API, and with which queries. Any way to find this out?


Getting a detailed breakdown of BigQuery Storage API costs at the table or user level (with query visibility) within the Google Cloud Platform's native tools can be challenging. The SKU.description field provides only a high-level categorization.

Here are strategies you can combine to gain more granular insights:

1. Audit Logs:

  • Enable Cloud Audit Logs for BigQuery (if not already enabled).
  • Examine the audit logs for "BigQuery Storage Read API" operations (a log-filter sketch follows this list). The logs can provide:
    • Username or service account initiating the API call.
    • Project ID and dataset ID associated with the accessed table.
    • Timestamps of API calls.
  • Note that audit logs might not capture the exact SQL query text.
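
If you read these Data Access entries with the Cloud Logging Python client, you can list the relevant calls directly. A minimal sketch, assuming read sessions are logged under the bigquerystorage.googleapis.com service and that the audit payload is exposed dict-style; the project ID and time window are placeholders:

```python
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # placeholder project ID

# Data Access entries for Storage Read API session creation; the
# CreateReadSession entry names the table being read in resourceName.
LOG_FILTER = """
protoPayload.serviceName="bigquerystorage.googleapis.com"
protoPayload.methodName="google.cloud.bigquery.storage.v1.BigQueryRead.CreateReadSession"
timestamp>="2024-01-01T00:00:00Z"
"""

for entry in client.list_entries(filter_=LOG_FILTER, page_size=100):
    payload = entry.payload or {}
    caller = payload.get("authenticationInfo", {}).get("principalEmail")
    table = payload.get("resourceName")  # projects/.../datasets/.../tables/...
    print(entry.timestamp, caller, table)
```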

2. Custom Logging using the Storage API Client Libraries:

  • If you are using client libraries (e.g., Python, Java) for Storage API interaction, instrument your code (see the sketch after this list) to log:
    • Table names being accessed.
    • Usernames or identifiers associated with the requests.
    • Potentially, the query text or a summary of the query.
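
As a sketch of what that instrumentation could look like with the Python bigquery_storage_v1 client — the helper name, logger name, and user_id argument below are purely illustrative, not part of the library:

```python
import logging
from google.cloud import bigquery_storage_v1
from google.cloud.bigquery_storage_v1 import types

usage_log = logging.getLogger("bq_storage_usage")

def create_logged_read_session(client, project_id, table_path, user_id):
    """Open a read session, recording who is reading which table.

    Illustrative helper: the logger and user_id are our own additions,
    not part of the client library.
    """
    usage_log.info("storage_read user=%s table=%s", user_id, table_path)
    session = types.ReadSession(
        table=table_path,  # e.g. "projects/p/datasets/d/tables/t"
        data_format=types.DataFormat.AVRO,
    )
    return client.create_read_session(
        parent=f"projects/{project_id}",
        read_session=session,
        max_stream_count=1,
    )

# Usage:
# client = bigquery_storage_v1.BigQueryReadClient()
# session = create_logged_read_session(
#     client, "my-project",
#     "projects/my-project/datasets/my_dataset/tables/my_table",
#     "etl-service")
```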

3. Leverage Job Information and Query History:

  • Analyze the BigQuery job history, for example via the INFORMATION_SCHEMA.JOBS views (a query sketch follows this list). Look for queries associated with high data-processing volumes.
  • Correlate Job IDs from history with timestamps in Audit Logs or your custom logging to tie queries to storage usage.
  • This approach will provide query-associated table information but might not directly reflect the precise Storage API egress costs.
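
A sketch of mining the job history in bulk via the INFORMATION_SCHEMA.JOBS_BY_PROJECT view, which exposes the user, referenced tables, and bytes processed per job; this assumes an EU-located project, so adjust the region qualifier and lookback window:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

# Rank recent jobs by bytes processed and show which tables they touched.
sql = """
SELECT
  user_email,
  job_id,
  total_bytes_processed,
  ref.dataset_id,
  ref.table_id
FROM `region-eu`.INFORMATION_SCHEMA.JOBS_BY_PROJECT,
     UNNEST(referenced_tables) AS ref
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY total_bytes_processed DESC
LIMIT 50
"""

for row in client.query(sql).result():
    print(row.user_email, row.dataset_id, row.table_id, row.total_bytes_processed)
```

Keep in mind that reads that go purely through the Storage Read API (e.g., Spark or Dataflow connectors reading a table directly) may not show up as jobs at all, which is why correlating with the audit logs matters.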

4. Consider Cost Allocation Labels:

  • Apply cost allocation labels to BigQuery jobs or tables (see the sketch below). This can help you track costs at a project, team, or application level, but again, not directly at the Storage API usage level.
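
A minimal sketch of labeling a query job with the Python client; the label keys are arbitrary examples, and the labels later surface in the billing export's labels field:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

# Labels set on the job propagate to the billing export for grouping.
job_config = bigquery.QueryJobConfig(
    labels={"team": "analytics", "app": "nightly-report"}
)
job = client.query("SELECT 1", job_config=job_config)
job.result()
```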

5. Usage Analysis (Manual or Tool-assisted):

  • Manually examine your Cloud Billing data, ideally the billing export to BigQuery, for patterns (a query sketch follows this list). Are there specific periods with high 'BigQuery Storage API Network Internet Egress Europe to Europe' costs?
  • Try to correlate those periods with activity in your projects or known query patterns.
  • Consider building or utilizing third-party tools that can parse your billing data and audit logs to provide more targeted analytics.
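
If you already have the standard billing export to BigQuery enabled, a query along these lines can break that SKU down by project and day; the export table name is a placeholder, and the export does not go below project/label granularity:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-billing-project")  # placeholder project ID

# Daily cost per project for the egress SKU in question.
sql = """
SELECT
  project.id AS project_id,
  DATE(usage_start_time) AS usage_day,
  SUM(cost) AS total_cost
FROM `my-billing-project.billing.gcp_billing_export_v1_XXXXXX`  -- placeholder
WHERE sku.description = 'BigQuery Storage API Network Internet Egress Europe to Europe'
GROUP BY project_id, usage_day
ORDER BY total_cost DESC
"""

for row in client.query(sql).result():
    print(row.project_id, row.usage_day, row.total_cost)
```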

Challenges and Limitations:

  • Data Aggregation: GCP billing and Audit Logs might not provide data at the desired level of granularity out of the box. You might need to aggregate and correlate data from multiple sources.

  • Complexity: Setting up detailed audit logs and custom logging requires effort and maintenance.

  • Potential Overhead: Adding extensive logging can introduce some overhead to your system.

Thank you for the detailed answer. I'll look into the audit logs.

I've already looked through the job history, but there are almost no extract queries there.

We have a fairly large, old project and no list of the clients that can read data from BQ. Right now, Storage API billing amounts to about half of our Analysis costs. Maybe that's fine, but we need to be sure that no forgotten, unneeded service is generating these costs.