Hi,
I'm using BigQuery interactive queries (not scheduled queries or the API) to perform a task that requires many loop iterations, and therefore many temporary table creations, table queries, and INSERT statements into an already created table. The sample_uc table has around 250,000 rows and 6 columns, while the original uc table will have about 600,000 rows and 6 columns.
Things are running fine for now because I'm limiting the loop to a maximum of 20 iterations. But what would happen if I don't limit the iterations, or when I use the original table with 600,000 rows?
Will this trigger any errors due to quota limitations, given the INSERT statements (around 5,600 of them), the table queries, and the creation and dropping of temporary tables?
I'm using the query below for reference.
Yes, you are likely to run into quota limitations if you do not limit the iterations of your loop.
The quotas most likely to be affected are the limit on concurrent interactive queries and the maximum execution time of your script, and every iteration also submits another INSERT job against the same table.
To avoid hitting these quotas, you can limit the number of iterations, batch your INSERT statements together, or restructure the logic into a single set-based query, as sketched below. It is also worth testing the approach on the smaller sample_uc table before running it against the full table.
Additional considerations: cost and performance will scale with the number of iterations.
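As a minimal sketch of the set-based option, assuming the table and column names from your script (parts, Material, Option__Variant_, Future_Price, and the all_part_names variable you already build), the per-material loop can often be collapsed into one INSERT, so a single job runs instead of one per material:

EXECUTE IMMEDIATE FORMAT("""
  INSERT INTO `nebbiu-data-hrp.testing.processed`
  SELECT *
  FROM (
    SELECT
      Option__Variant_,
      Future_Price,
      CONCAT(Material, 'B0000') AS base_model,  -- one output group per material
      'B0000' AS option_code,
      CAST(NULL AS FLOAT64) AS list_price_
    FROM parts
  )
  -- PIVOT groups by the remaining columns (base_model, option_code, list_price_),
  -- which reproduces the per-material rows your loop inserts one at a time
  PIVOT (SUM(Future_Price) FOR Option__Variant_ IN (%s))
""", all_part_names);

This keeps the same output schema as the per-iteration PIVOT while issuing one INSERT job instead of thousands.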
Hi ms4446,
Many thanks for your quick response.
Referring to your response above, I have the following queries; I'd appreciate your help with them.
- What is meant by '50 concurrent interactive queries'? Does it mean the total number of queries run in a single query window/session? If so, and my loop runs 1 query per iteration, does that mean I cannot run my loop more than 50 times?
- Regarding the suggestions under 'To avoid hitting these quotas, you can:':
Unfortunately, I cannot limit the iterations, since the loop has to run once for every unique material I find (maybe I need to change the entire query logic for this; the PIVOT step is really causing issues for me).
Furthermore, sample_uc might run fine, but I'm required to run this on the actual (larger) dataset, which is what concerns me.
50 Concurrent Interactive Queries
This means that you can have a maximum of 50 queries running at the same time in BigQuery. If you try to run a 51st query while 50 are already running, it will be placed in a queue and will not start until one of the other queries finishes.
If your loop runs 1 query per iteration, the limit applies to queries running at the same time, so up to 50 of its queries could run concurrently before you hit the quota. However, it's worth noting that if you are running other queries outside the loop, you could hit the limit even if the loop has fewer than 50 iterations.
Solutions
There are a few ways to get around the 50 concurrent interactive queries quota, such as limiting the number of iterations, running queries at batch priority, or combining the per-iteration statements into fewer queries.
Hello, the '50 concurrent queries per user' limit, is it documented somewhere in the official BigQuery quotas/limits? I can't find it in the list. Where is this officially stated, please?
Thank you
I apologize for the confusion. The information given in the previous post is outdated.
BigQuery now uses a queuing system for both batch and interactive queries, which is described in the official documentation: https://cloud.google.com/bigquery/quotas
Key Points: there is no longer a fixed 50-concurrent-query limit per user. Instead, BigQuery dynamically determines how many queries can run concurrently based on available slots, and queries beyond that concurrency are queued (up to 1,000 queued interactive queries per project).
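You can also observe this queue yourself. A sketch using INFORMATION_SCHEMA (adjust the region qualifier to your dataset's location):

SELECT
  state,  -- PENDING = queued, RUNNING = currently executing
  COUNT(*) AS jobs
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
  AND job_type = 'QUERY'
GROUP BY state;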
Hello, thank you for taking the time to respond.
I found more info in this thread, which seems up to date:
https://blog.fourninecloud.com/bigquery-query-queues-query-concurrency-in-your-control-c60411868997
In general, with the pay-as-you-go (on-demand) mode, it looks like we won't know the potential number of concurrent queries in advance, as it always depends on the slots available at a given moment;
it could be 10, it could be 300. The rest go into the queue, up to 1,000 items.
With a reservation, I noticed there is a parameter where we can indicate the concurrency we want, but there still need to be enough slots to make it happen.
Regards
Zdenek
Your understanding and summary of the information from the blog post about BigQuery's query concurrency are accurate:
On-demand (pay as you go): the number of queries that run concurrently is dynamic and depends on the slots available at that moment, so it cannot be predicted in advance; additional queries are queued.
Reservations: you can set a target concurrency on the reservation, but the concurrency actually achieved still depends on having enough slots.
Key Takeaway: BigQuery's dynamic concurrency and queuing mechanism is designed to allow many queries to run without strict per-user concurrency limits. The queuing system allows BigQuery to manage workloads and avoid overloading the system, even if there are sudden spikes in query demand.
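For reference, the concurrency target mentioned above is set on the reservation itself. A sketch using reservation DDL (the project and reservation names here are placeholders, and I'm assuming the target_job_concurrency option as documented for reservation DDL; the target is only honored if enough slots are available):

ALTER RESERVATION `admin-project.region-us.my-reservation`
SET OPTIONS (target_job_concurrency = 10);  -- a target, not a guarantee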
@ammar_hanif wrote:
Hi,
I'm using BigQuery interactive queries (not scheduled queries or the API) to perform a task that requires many loop iterations, and therefore many temporary table creations, table queries, and INSERT statements into an already created table. The sample_uc table has around 250,000 rows and 6 columns, while the original uc table will have about 600,000 rows and 6 columns.
Things are running fine for now because I'm limiting the loop to a maximum of 20 iterations. But what would happen if I don't limit the iterations, or when I use the original table with 600,000 rows?
Will this trigger any errors due to quota limitations, given the INSERT statements (around 5,600 of them), the table queries, and the creation and dropping of temporary tables?
I'm using the query below for reference:

DECLARE unique_material ARRAY<STRING>;
DECLARE index INT64 DEFAULT 0;
DECLARE current_value STRING;
DECLARE part_names_query STRING;
DECLARE all_part_names STRING;

-- Parts: everything except the UNIT variants (the base model row is kept)
CREATE TEMP TABLE parts AS (
  SELECT * FROM `nebbiu-data-hrp.testing.sample_uc`
  WHERE Option__Variant_ NOT LIKE 'UNIT%' OR Option__Variant_ LIKE 'UNIT_B0000'
);

-- CREATE TEMP TABLE base_models AS (
--   SELECT * FROM `nebbiu-data-hrp.testing.sample_uc`
--   WHERE Option__Variant_ LIKE 'UNIT_B0000'
-- );

CREATE TEMP TABLE variant_models AS (
  SELECT * FROM `nebbiu-data-hrp.testing.sample_uc`
  WHERE Option__Variant_ LIKE 'UNIT%' AND Option__Variant_ NOT LIKE 'UNIT_B0000'
);

SET unique_material = (SELECT ARRAY_AGG(DISTINCT Material) FROM parts);

-- Build the quoted column list used by the PIVOT below
EXECUTE IMMEDIATE """
  SELECT STRING_AGG(DISTINCT CONCAT("'", Option__Variant_, "'")) FROM parts
""" INTO all_part_names;

-- WHILE index <= ARRAY_LENGTH(unique_material) DO
WHILE index <= 9 DO
  SET current_value = unique_material[SAFE_OFFSET(index)];

  CREATE TEMP TABLE filtered_table AS (
    SELECT Option__Variant_, Future_Price,
           CONCAT(current_value, 'B0000') AS base_model,
           'B0000' AS option_code,
           NULL AS list_price_
    FROM parts
    WHERE Material = current_value
  );

  -- SET part_names = REGEXP_REPLACE(part_names, r'[./-]', '_');

  -- One INSERT per material: pivot this material's parts into columns
  EXECUTE IMMEDIATE FORMAT("""
    INSERT INTO `nebbiu-data-hrp.testing.processed`
    SELECT * FROM filtered_table
    PIVOT (SUM(Future_Price) FOR Option__Variant_ IN (%s))
  """, all_part_names);

  DROP TABLE filtered_table;
  SET index = index + 1;
END WHILE;

-- Build "NULL AS <part>" fillers so variant rows match the pivoted schema
EXECUTE IMMEDIATE """
  SELECT STRING_AGG(DISTINCT CONCAT("NULL AS ", Option__Variant_)) FROM parts
""" INTO part_names_query;

EXECUTE IMMEDIATE FORMAT("""
  INSERT INTO `nebbiu-data-hrp.testing.processed`
  SELECT REGEXP_REPLACE(VarKey, r'.UNIT.', '') AS base_model,
         REGEXP_REPLACE(Option__Variant_, r'UNIT_', '') AS option_code,
         Future_Price AS list_price_,
         %s
  FROM variant_models
  LIMIT 100
""", part_names_query);

-- Fall back to the base model price when a variant has no list price
CREATE OR REPLACE TABLE `nebbiu-data-hrp.testing.processed` AS
SELECT *,
       COALESCE(list_price_, UNIT_B0000) AS list_price
FROM `nebbiu-data-hrp.testing.processed`;

ALTER TABLE `nebbiu-data-hrp.testing.processed`
  DROP COLUMN list_price_,
  DROP COLUMN UNIT_B0000;
Running a large number of iterations with temporary tables, insert statements, and queries in BigQuery may lead to various challenges. Some factors to consider are:
Quota Limitations: BigQuery has various quotas and limits, such as the number of concurrent interactive queries, the maximum query and script execution time, and limits on table operations. Running too many iterations may hit these limits and result in errors.
Costs: Executing numerous queries and creating temporary tables can contribute to increased costs, especially if you are dealing with large datasets and performing complex operations.
Performance: As the number of iterations increases, the overall performance of your queries may degrade. You may experience longer execution times and higher resource consumption.
Error Handling: Without proper error handling mechanisms, your script might fail if any individual query or iteration encounters an issue.
Code Complexity: The more iterations you have, the more complex your code becomes. This can make it harder to maintain, debug, and optimize.
To address these concerns: reduce the number of per-iteration statements (for example, by combining the loop's INSERTs into one set-based query, as sketched earlier in this thread), add error handling around the statements that might fail (see the sketch below), and monitor your quota and cost usage as you scale up.
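For the error-handling point, BigQuery scripting supports BEGIN ... EXCEPTION blocks. A minimal sketch, wrapping the loop's INSERT from your script so one failure doesn't abort the whole run:

BEGIN
  EXECUTE IMMEDIATE FORMAT("""
    INSERT INTO `nebbiu-data-hrp.testing.processed`
    SELECT * FROM filtered_table
    PIVOT (SUM(Future_Price) FOR Option__Variant_ IN (%s))
  """, all_part_names);
EXCEPTION WHEN ERROR THEN
  -- @@error.message holds the failure reason; log it, then continue or RAISE
  SELECT @@error.message AS insert_error;
END;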
Additionally, when dealing with large datasets, it's always advisable to test your scripts on a smaller scale before scaling up to avoid unexpected issues.
Remember to review BigQuery documentation and monitor your project's usage to ensure you stay within the limits and quotas imposed by the service.
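As a starting point for that monitoring, a sketch that tallies recent bytes billed per day from INFORMATION_SCHEMA (adjust the region qualifier; total_bytes_billed is what drives on-demand cost):

SELECT
  DATE(creation_time) AS day,
  SUM(total_bytes_billed) / POW(1024, 4) AS tib_billed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND job_type = 'QUERY'
GROUP BY day
ORDER BY day;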