Hi everyone,
I'm getting persistent 429 errors (rate limit exceeded) when calling the gemini-2.0-flash-thinking-exp-01-21 model through Vertex AI. Here’s a brief summary of my situation:
Yesterday, I used Vertex AI to process a batch of data with gemini-2.0-flash-thinking-exp-01-21.
After roughly 5,000 items had been processed, I started receiving 429 errors.
I interrupted the task and restarted it, and every request since then has returned a 429 error.
My code uses a ThreadPoolExecutor from Python’s concurrent.futures with max_workers set to 32. Reducing max_workers to 16 makes no difference.
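For reference, here is a minimal sketch of the pattern described above, with retry and exponential backoff added as one commonly suggested way to handle 429 responses. It is an illustration under stated assumptions, not the actual code: `call_model` is a placeholder rather than a real Vertex AI SDK call, and `RateLimitError`, `call_with_backoff`, and `process_all` are hypothetical names.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""


def call_model(item):
    # Placeholder for the real request to the model; NOT an actual SDK call.
    return f"processed:{item}"


def call_with_backoff(item, request_fn=call_model, max_retries=5, base_delay=1.0):
    """Call request_fn(item), retrying with exponential backoff on RateLimitError."""
    for attempt in range(max_retries):
        try:
            return request_fn(item)
        except RateLimitError:
            if attempt == max_retries - 1:
                # Out of retries: surface the error to the caller.
                raise
            # Wait base_delay * 2^attempt plus random jitter, so that
            # parallel workers do not all retry at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)


def process_all(items, max_workers=32, **kwargs):
    """Process items in parallel threads, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda it: call_with_backoff(it, **kwargs), items))
```

Usage would look like `results = process_all(data_items, max_workers=16)`. Whether backoff helps here depends on whether the 429s come from a per-minute limit or from a quota that is already exhausted, which I have not been able to determine.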
Has anyone run into something similar, or can anyone offer insight into why this is happening?
Thanks in advance for your help!