Error message 429 when using Claude dev with GCP Vertex AI

Recently I added GCP Vertex AI as an API Provider in Claude dev.

But I got this error message:
"429 {"error":{"code":429,"message":"Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: anthropic-claude-3-5-sonnet. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.","status":"RESOURCE_EXHAUSTED"}}"

However, when I check Quotas & System Limits, the value for the region and base model I used is Unlimited.

Please help me with this. Thanks.


Hi @Tommm07,

Welcome to Google Cloud Community!

Could you confirm whether the region you're using is listed in this documentation? Some regions use dynamic shared quota, which means resources are shared among users. Although no individual quota is assigned, resources can be temporarily unavailable during periods of high demand, which may be what is happening on your end. As a workaround, you may consider implementing exponential backoff for retries to reduce the load on the API.
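To illustrate the retry suggestion, here is a minimal sketch of exponential backoff with jitter in Python. It is not tied to any particular SDK: `QuotaExceededError` and `request_fn` are hypothetical placeholders for whatever exception your client raises on a 429 / RESOURCE_EXHAUSTED response and whatever function sends your request, and the attempt count and delays are illustrative defaults rather than recommended values.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the exception your client raises on HTTP 429."""


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0,
                      max_delay=32.0, sleep=time.sleep):
    """Call request_fn, retrying on QuotaExceededError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            # The delay ceiling doubles each attempt (base, 2x, 4x, ...),
            # capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random time in [0, ceiling] so that many
            # clients do not all retry at the same moment.
            sleep(random.uniform(0, ceiling))
```

You would wrap your own request in a function and pass it in, for example `call_with_backoff(lambda: send_prompt("hello"))`, where `send_prompt` is your own (hypothetical) helper that raises `QuotaExceededError` when the API returns 429.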

You may also consider requesting a quota increase by contacting the Google Sales Team here.

Hope this helps.

We are facing the same issue while trying to issue just a single invocation in the europe-west1 and us-central1 regions. The documentation link you shared above doesn't give any information about using Claude or which regions are valid for this model.

Hi,

We are facing the same issue when trying to issue a single invocation to the Claude model. The documentation link you shared doesn't outline which regions this model supports.