Constant 429 Errors with gemini-2.0-flash-thinking-exp-01-21 API on Vertex AI

Hi everyone,

I'm encountering consistent 429 errors (rate limit exceeded) when using the Vertex AI gemini-2.0-flash-thinking-exp-01-21 API. Here’s a brief summary of my situation:

  • Yesterday, I used Vertex AI to process data with the gemini-2.0-flash-thinking-exp-01-21 API.

  • After processing around 5000 data items, I started receiving 429 errors.

  • When I interrupted the task and restarted it, all subsequent requests returned a 429 error.

  • My code employs a ThreadPoolExecutor from Python’s concurrent.futures with max_workers set to 32. Even reducing max_workers to 16 doesn't alleviate the issue.
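For reference, here is a minimal sketch of the setup described above, with exponential backoff and jitter added around each request. It is an illustration under assumptions, not the actual code from this post: `RateLimitError`, `with_backoff`, and `process_all` are hypothetical names, and `fn` stands in for whatever function sends one request to the model. With the Google API client libraries, a 429 usually surfaces as `google.api_core.exceptions.ResourceExhausted`, so that would likely be the exception to catch in place of the stand-in.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimitError(Exception):
    """Hypothetical stand-in for the 429 error raised by the real client."""


def with_backoff(fn, item, max_attempts=5, base_delay=1.0,
                 max_delay=60.0, sleep=time.sleep):
    """Call fn(item), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn(item)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error to the caller
            # Cap the exponential delay, then pick a random point below it
            # ("full jitter") so the workers do not all retry at once.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


def process_all(fn, items, max_workers=4, **backoff_kwargs):
    """Run fn over items in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda item: with_backoff(fn, item, **backoff_kwargs), items))
```

Something like `process_all(send_request, data, max_workers=4)` would then replace the bare executor loop, where `send_request` is again a placeholder. Backoff only helps if the 429s come from short bursts above a per-minute limit; if a quota is exhausted outright, retries will keep failing until it resets.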

Has anyone experienced similar issues, or can anyone offer insight into why this might be happening?

Thanks in advance for your help!

2 REPLIES