Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Restrict users from running expensive queries on Bigquery

Our users include wide range of people like software engineers and some product folks. Often times, they query a table with no proper filters or scan a lot number of partitions.

Want to see if we have a any feature like: 

- If the query doesn't have a partition clause, don't run the query

- if the query scans too many partitions, show a warning dialogue to the user.

Something like this ?

1 2 323
2 REPLIES 2

To implement a feature that controls query execution based on certain criteria like the absence of a partition clause or the scanning of too many partitions, you can take a few approaches depending on the database or data warehouse system you are using. Below are some general strategies and implementation ideas:

Query Analysis and Rejection

Method:

  • Query Pre-Processor: Create a middleware layer that intercepts and analyzes SQL queries before they are executed.
  • Query Parsing: Use a SQL parser to check for the presence of a partition clause and to estimate the number of partitions that will be scanned.
  • Validation Rules: Define rules such as mandatory partition clauses or maximum partitions to be scanned.
  • Feedback Mechanism: Provide immediate feedback to the user if the query violates any rules.

Implementation Steps:

  1. SQL Parsing: Use a library like SQLParse (Python), JSqlParser (Java), or ANTLR to parse the SQL queries.
  2. Rule Check:
    • For partition clause check, look for the presence of WHERE conditions on partition columns.
    • For partition scan check, estimate the number of partitions that will be scanned based on the query.
  3. User Feedback:
    • If no partition clause is found, reject the query with a message.
    • If too many partitions are scanned, show a warning and prompt the user for confirmation.

@Poala_Tenorio is there a guide on how to automatically attach any of these query-preprocessors to the execution flow of a bigquery based? can this be limited to certain IAM roles? Meaning once I have a query-preprocessor configured, how do I ensure that every member of an IAM role must pass the query-preprocessor check before a query is submitted for execution in bigquery independent of the source of the bigquery invocation (API, notebooks, UI queries, etc). Could you provide the steps necessary to achieve this configuration?