{
"code": 429,
"message": "Quota exceeded for aiplatform.googleapis.com/online_prediction_concurrent_requests_per_base_model. Please submit a quota increase request.",
"status": "RESOURCE_EXHAUSTED"
}
-----
It is possible that the resources for that region are already exhausted. Can you try calling it from a different region? You can also request a quota increase on the IAM Quotas page: use the little pencil icon ("EDIT QUOTAS") at the upper right of the console.
Manage your Quotas: https://cloud.google.com/docs/quotas/view-manage
Regarding this issue, I have conducted extensive testing, including adding more projects and billing accounts. However, as you mentioned, it appears that the resources in the specific region are indeed exhausted.
(Despite minimal usage, I observed instances where resources were unavailable, while at other times, extensive usage did not cause any issues.)
I would also like to add that I was unable to find any information regarding this specific quota ("concurrent_requests_per_base_model") within the quota management section.
Therefore, based on the assumption of regional resource depletion, I have structured my system to utilize a combination of European and US regions, along with Anthropic's native API.
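For anyone wanting to do something similar, here is a minimal sketch of the region fallback idea, not my actual implementation. The `QuotaExhausted` exception, the `call_with_region_fallback` helper and the `send` callable are hypothetical names introduced for illustration; in a real setup `send` would wrap whatever client call you make for a given region (or for the native API) and raise when it gets the 429 / RESOURCE_EXHAUSTED response shown above.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical error standing in for a 429 / RESOURCE_EXHAUSTED response."""


def call_with_region_fallback(
    regions: Sequence[str],
    send: Callable[[str], T],
) -> T:
    """Call send(region) for each region in order and return the first success."""
    last_error: Optional[Exception] = None
    for region in regions:
        try:
            return send(region)
        except QuotaExhausted as err:
            # This region reported exhausted quota, so move on to the next one.
            last_error = err
    # Every region failed with a quota error (or no regions were given).
    raise RuntimeError(f"all regions exhausted: {list(regions)}") from last_error
```

You would call it with an ordered list such as `["us-east5", "europe-west1"]` (example region names only; use the regions where your model is actually available) and a `send` function that builds the request for the given region. Only quota errors trigger the fallback, so other failures still surface immediately.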
Thank you for your assistance.