BigQuery Quota Limit understanding

Hi,

I'm using BigQuery interactive queries (not scheduled queries, and not the API) to perform a task that requires many loop iterations, and therefore a lot of temp table creation, table reads, and INSERT statements into an already-created table. The sample_uc table has around 250,000 rows and 6 columns; the original uc table will have around 600,000 rows and 6 columns.

Things are running fine for now because I'm limiting the loop to a maximum of 20 iterations. But what would happen if I didn't limit the iterations, or when I switch to the original table with 600,000 rows?

Would this trigger any errors due to quota limitations for the INSERTs (around 5,600 INSERT statements would be required), the table reads, and the creation and dropping of temporary tables?

I'm using the query below for reference:

DECLARE unique_material ARRAY<STRING>;
DECLARE index INT64 DEFAULT 0;
DECLARE current_value STRING;
DECLARE part_names_query STRING;
DECLARE all_part_names STRING;

-- Parts: all rows that are not UNIT variants, plus the UNIT_B0000 base-model rows
CREATE TEMP TABLE parts AS (
  SELECT * FROM `nebbiu-data-hrp.testing.sample_uc`
  WHERE Option__Variant_ NOT LIKE 'UNIT%' OR Option__Variant_ LIKE 'UNIT_B0000'
);

-- CREATE TEMP TABLE base_models AS (
--   SELECT * FROM `nebbiu-data-hrp.testing.sample_uc`
--   WHERE Option__Variant_ LIKE 'UNIT_B0000'
-- );

-- Variant models: UNIT rows other than the UNIT_B0000 base model
CREATE TEMP TABLE variant_models AS (
  SELECT * FROM `nebbiu-data-hrp.testing.sample_uc`
  WHERE Option__Variant_ LIKE 'UNIT%' AND Option__Variant_ NOT LIKE 'UNIT_B0000'
);

-- Distinct materials to iterate over in the loop below
SET unique_material = (
  SELECT ARRAY_AGG(DISTINCT Material) FROM parts
);

-- Build a quoted, comma-separated list of all part names for the PIVOT IN (...) clause
EXECUTE IMMEDIATE """
  SELECT STRING_AGG(DISTINCT CONCAT("'", Option__Variant_, "'")) FROM parts """
  INTO all_part_names;

-- Full run over all materials (index is 0-based, so use <, not <=):
-- WHILE index < ARRAY_LENGTH(unique_material) DO
WHILE index <= 9 DO  -- capped for testing
  SET current_value = unique_material[SAFE_OFFSET(index)];
 
  -- This material's parts, keyed to its base model
  CREATE TEMP TABLE filtered_table AS (
    SELECT Option__Variant_, Future_Price, CONCAT(current_value, 'B0000') AS base_model,
      'B0000' AS option_code, NULL AS list_price_
    FROM parts
    WHERE Material = current_value);


  -- SET part_names = REGEXP_REPLACE(part_names, r'[./-]', '_');

  -- Pivot this material's parts into one wide row and append it to the output table
  EXECUTE IMMEDIATE FORMAT("""
    INSERT INTO `nebbiu-data-hrp.testing.processed`
    SELECT * FROM filtered_table
    PIVOT (SUM(Future_Price) FOR Option__Variant_ IN (%s)) """,
    all_part_names);

  DROP TABLE filtered_table;

  SET index = index + 1;
END WHILE;

-- Build "NULL AS <part name>" placeholders so variant rows match the pivoted schema
EXECUTE IMMEDIATE """
  SELECT STRING_AGG(DISTINCT CONCAT("NULL AS ", Option__Variant_)) FROM parts """
  INTO part_names_query;

-- Append the variant rows, deriving base_model and option_code from the variant key
EXECUTE IMMEDIATE FORMAT("""
  INSERT INTO `nebbiu-data-hrp.testing.processed`
  SELECT REGEXP_REPLACE(VarKey, r'.UNIT.', '') AS base_model,
    REGEXP_REPLACE(Option__Variant_, r'UNIT_', '') AS option_code,
    Future_Price AS list_price_, %s
  FROM variant_models LIMIT 100 """,
  part_names_query);

-- Fall back to the base-model price where a row has no explicit list price
CREATE OR REPLACE TABLE `nebbiu-data-hrp.testing.processed` AS
SELECT *,
  COALESCE(list_price_, UNIT_B0000) AS list_price
FROM `nebbiu-data-hrp.testing.processed`;

-- Drop the helper columns now that list_price is materialized
ALTER TABLE `nebbiu-data-hrp.testing.processed`
DROP COLUMN list_price_,
DROP COLUMN UNIT_B0000;
Solved
1 ACCEPTED SOLUTION

50 Concurrent Interactive Queries

This means you can have at most 50 interactive queries running at the same time in your BigQuery project. If you submit a 51st query while 50 are already running, it is placed in a queue and will not start until one of the other queries finishes.

Note that the limit is on concurrent queries, not total queries: the statements inside a scripted loop run one after another, so a loop issuing one query per iteration is not, by itself, holding 50 queries open at once. However, if you or other users in the project are running additional queries alongside the loop, you can hit the limit even with far fewer than 50 iterations.
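
If you want to see how close you are to the limit, you can count the interactive queries currently running in your project from INFORMATION_SCHEMA. A minimal sketch, assuming your jobs run in the US region (swap `region-us` for your own region):

-- Count interactive queries currently running in this project
SELECT COUNT(*) AS running_interactive_queries
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)  -- keep the scan window small
  AND job_type = 'QUERY'
  AND priority = 'INTERACTIVE'
  AND state = 'RUNNING';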

Solutions

There are a few ways to get around the 50 concurrent interactive queries quota:

  • Use a batch job: Batch queries are queued and executed as resources become available, so they can run for longer periods and absorb more queries than interactive ones. To use this, put the queries you want to run into a script and submit it at batch priority, or set it up as a BigQuery scheduled query.
  • Optimize the query: You can try to make the query itself do less work, for example by partitioning or clustering tables, using more efficient join strategies, using BigQuery's built-in functions, or collapsing per-iteration statements into a single set-based query (a sketch of that rewrite follows this list).
  • Use a smaller sample: If you're looking for a rough estimate or preliminary results, you could use a smaller sample of your data (see the TABLESAMPLE sketch below). For accurate results, however, you'll need the entire dataset.
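
As a concrete example of the second point: because PIVOT groups by every non-pivoted column, the per-material loop in your script can usually collapse into a single INSERT by deriving base_model from Material directly. A minimal sketch against the same temp tables, assuming each Material maps to exactly one base model and that Future_Price is FLOAT64:

-- One INSERT instead of one per material: PIVOT groups by the non-pivoted
-- columns (base_model, option_code, list_price_), so this yields one
-- output row per material.
EXECUTE IMMEDIATE FORMAT("""
  INSERT INTO `nebbiu-data-hrp.testing.processed`
  SELECT * FROM (
    SELECT Option__Variant_, Future_Price,
      CONCAT(Material, 'B0000') AS base_model,
      'B0000' AS option_code,
      CAST(NULL AS FLOAT64) AS list_price_
    FROM parts
  )
  PIVOT (SUM(Future_Price) FOR Option__Variant_ IN (%s)) """,
  all_part_names);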
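
And for sampling, rather than maintaining a separate sample table, the TABLESAMPLE clause can pull a rough subset on the fly. A sketch, assuming the full table lives at `nebbiu-data-hrp.testing.uc` (a hypothetical path):

-- Reads roughly 10% of the table's storage blocks; the result is
-- approximate, not a uniform row sample, so use it only for testing.
CREATE TEMP TABLE parts_sample AS (
  SELECT * FROM `nebbiu-data-hrp.testing.uc` TABLESAMPLE SYSTEM (10 PERCENT)
);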

