Constant 429 Errors with gemini-2.0-flash-thinking-exp-01-21 API on Vertex AI

Hi everyone,

I'm encountering consistent 429 errors (rate limit exceeded) when using the Vertex AI gemini-2.0-flash-thinking-exp-01-21 API. Here’s a brief summary of my situation:

Yesterday, I used Vertex AI to process data with the gemini-2.0-flash-thinking-exp-01-21 API.
After processing around 5000 data items, I started receiving 429 errors.
When I interrupted the task and restarted it, all subsequent requests returned a 429 error.
My code uses a ThreadPoolExecutor from Python’s concurrent.futures with max_workers set to 32. Reducing max_workers to 16 did not help; the 429 errors continued.
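
A minimal sketch of how retries with exponential backoff and jitter could be layered onto a thread pool like the one described above. This is not the original code. `RateLimitError`, `call_with_backoff`, and `process_all` are hypothetical names introduced for illustration, and `fn` stands in for whatever function actually sends a request to the model. With the Google client libraries a 429 usually surfaces as `google.api_core.exceptions.ResourceExhausted`, but that mapping is an assumption here and should be checked against the SDK in use.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception the client raises on HTTP 429."""


def call_with_backoff(fn, item, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(item), retrying on RateLimitError with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn(item)
        except RateLimitError:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Cap doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": sleep a random time in [0, cap] so that workers
            # do not all retry at the same instant.
            sleep(random.uniform(0, cap))


def process_all(fn, items, max_workers=4, **backoff_kwargs):
    """Run fn over items in a thread pool, wrapping each call in the retry logic."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda it: call_with_backoff(fn, it, **backoff_kwargs), items))
```

Backoff only helps if the limit is a per-minute rate. If a daily or project-level quota has been exhausted, as the "every request fails after restart" behaviour might suggest, retries will keep failing until the quota resets or is raised.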

Has anyone experienced similar issues or can offer insights into why this might be happening?

Thanks in advance for your help!
