
Vertex Claude Models get 'Quota exceeded'

I just tried the Claude Sonnet and Haiku models that were released for public use on Vertex AI, but I got this error: "RateLimitError: Error code: 429 - 'error': {'code': 429, 'message': 'Quota exceeded for aiplatform.googleapis.com/online_prediction_output_tokens_per_minute_per_base_model with base model: anthropic-claude-3-sonnet. status': 'RESOURCE_EXHAUSTED'".

Is it possible to increase this quota?


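The error message you received indicates that your project has exceeded its per-minute output-token quota for online prediction with the anthropic-claude-3-sonnet base model (the metric online_prediction_output_tokens_per_minute_per_base_model).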
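To see which quota to look for on the quotas page, it can help to pull the metric and base model out of the message text. Here is a minimal Python sketch that does this for the message quoted in the question. The function name parse_quota_error and the regular expression are my own illustration, not part of any SDK, and the pattern assumes the message keeps the wording shown above.

```python
import re

# Message text copied from the error in the question.
MESSAGE = (
    "Quota exceeded for aiplatform.googleapis.com/"
    "online_prediction_output_tokens_per_minute_per_base_model "
    "with base model: anthropic-claude-3-sonnet."
)

# Assumed message shape: "Quota exceeded for <service>/<metric> with base model: <model>."
_PATTERN = re.compile(
    r"Quota exceeded for (?P<service>[^/\s]+)/(?P<metric>\S+)"
    r" with base model: (?P<base_model>[\w.-]+?)\.?$"
)


def parse_quota_error(message):
    """Return the service, metric and base model named in the message, or None."""
    match = _PATTERN.search(message.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_quota_error(MESSAGE))
```

The metric name says the limit is on output tokens per minute, and the base model is the value to look for when filtering quotas.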

According to the documentation, rate limits for the Claude API are currently measured in requests per minute, tokens per minute, and tokens per day.
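If your quota is above zero and you only hit the per-minute limit occasionally, retrying with exponential backoff is a common workaround. Below is a rough Python sketch under some assumptions: QuotaExceededError is a hypothetical stand-in for whatever 429 / RESOURCE_EXHAUSTED exception your client library raises, and request_fn is any zero-argument function that sends your request. Retrying will not help if the quota itself is 0.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for a client library's 429 / RESOURCE_EXHAUSTED error."""


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call request_fn, retrying with exponential backoff on QuotaExceededError."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # The delay ceiling doubles on every attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Sleep a random time below the ceiling so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

You would wrap your existing call, for example call_with_backoff(lambda: send_prompt("Hello")), where send_prompt is your own function that raises QuotaExceededError (or your library's equivalent, after adjusting the except clause) on a 429.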

You can also take a look at this question indicating the same error.

The default value for all Claude models on the quotas page is 0, and I cannot change it to any value above zero.


I was able to submit a quota increase request. I'm on a paid plan, so you might need to upgrade to one.

I'm also on a paid plan - mind sharing how you did it?

Absolutely. Go to the Quotas & System Limits section under IAM & Admin. Filter by Service "Vertex AI API" and by Dimensions "anthropic". Find the model you want in the region you want, check its box, and click Edit quota.
