Error 429 when using Vertex AI API, even though my usage is far below quota

Hi, I'm getting Error 429 very frequently when using the Vertex AI API with the gemini-2.0-flash model, even though my usage is far below the quota for online prediction requests per minute.

[Attached screenshot: mariuszknowak_0-1740690657085.png]

How can I remediate this issue? I'm already on a paid tier.

Which other LLM (or other location) available on Google Cloud would you suggest to increase throughput?

As the next step in my project, I wanted to try genetic algorithms for prompt optimization. Unfortunately, with such frequent 429 errors this is practically impossible.
