{
"code": 429,
"message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_concurrent_requests_per_base_model. Please submit a quota increase request.",
"status": "RESOURCE_EXHAUSTED"
}
-----
It is possible that the resource for that region is already exhausted. Can you try calling it from a different region? You can also request a quota increase on the IAM Quotas page: click the little pencil icon ("EDIT QUOTAS") at the upper right of the console.
Manage your Quotas: https://cloud.google.com/docs/quotas/view-manage
Regarding this issue, I have conducted extensive testing, including adding additional projects and billing accounts. However, as you mentioned, it appears that the resources in the specific region are indeed exhausted.
(Despite minimal usage, I observed instances where resources were unavailable, while at other times, extensive usage did not cause any issues.)
I would also like to add that I was unable to find any information regarding this specific quota ("concurrent_requests_per_base_model") within the quota management section.
Therefore, based on the assumption of regional resource depletion, I have structured my system to utilize a combination of European and US regions, along with Anthropic's native API.
Thank you for your assistance.
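For anyone wanting to try the multi-region workaround described above, here is a minimal sketch of the idea, not the poster's actual implementation. It tries each region in order and moves on when a quota error is raised, with an optional last-resort fallback (for example, a direct API call). The names `QuotaExhausted`, `call_with_region_fallback`, and `fake_call`, as well as the region strings, are hypothetical placeholders; in real code you would catch your SDK's 429 exception (such as `anthropic.RateLimitError`) instead.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical stand-in for a 429 RESOURCE_EXHAUSTED error."""


def call_with_region_fallback(
    regions: Iterable[str],
    call_in_region: Callable[[str], T],
    final_fallback: Optional[Callable[[], T]] = None,
) -> T:
    """Try each region in order; skip to the next one on a quota error."""
    last_error: Optional[QuotaExhausted] = None
    for region in regions:
        try:
            return call_in_region(region)
        except QuotaExhausted as err:
            last_error = err  # remember the error and try the next region
    if final_fallback is not None:
        # Every region was exhausted (or none was given): use the last resort.
        return final_fallback()
    if last_error is None:
        raise ValueError("no regions were provided")
    raise last_error


# Usage with a fake backend: the first region is "exhausted", the second works.
def fake_call(region: str) -> str:
    if region == "us-east5":  # placeholder region name
        raise QuotaExhausted(f"quota exceeded in {region}")
    return f"ok from {region}"


result = call_with_region_fallback(["us-east5", "europe-west1"], fake_call)
print(result)
```

Passing the per-region call in as a function keeps the fallback logic independent of any particular client library, so the same loop can wrap whichever SDK call you use.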
I have the same problem. I have two unconnected accounts. It works perfectly on one but not on the other. I have sent emails and spoken to customer care, but so far there has been no update or fix. Both are paid accounts that are fully activated. Any help will be appreciated. Here is my error:
raise self._make_status_error_from_response(err.response) from None
anthropic.RateLimitError: Error code: 429 - [{'error': {'code': 429, 'message': 'Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: anthropic-claude-3-5-sonnet. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.', 'status': 'RESOURCE_EXHAUSTED'}}]
Hello,
I noticed your recent inquiry regarding an issue I had previously posted about. I wanted to provide an update on the matter.
At the time of my original post, I was developing an application using Claude 3.5 Sonnet. This choice was made because Sonnet was state-of-the-art at that time, and there weren't many alternatives offering comparable performance for our needs.
I'm pleased to share that I've since resolved the issue. Currently, I'm successfully implementing the desired functionality using Gemini Pro and Flash.
Regarding the error message you encountered, based on the information I gathered from various sources at the time, it was likely due to insufficient resources in the specific GCP region. However, I'm not certain if this particular issue still persists.