
Receiving quota error when trying to use bison chat model in Vertex AI

Hi,

I am working with the Vertex AI PaLM API and getting the following error. I would really appreciate any help.

Waiting
.....
---------------------------------------------------------------------------
_InactiveRpcError                         Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     71         try:
---> 72             return callable_(*args, **kwargs)
     73         except grpc.RpcError as exc:

10 frames
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.RESOURCE_EXHAUSTED
	details = "Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: textembedding-gecko. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:142.251.16.95:443 {created_time:"2023-07-30T08:53:03.982677819+00:00", grpc_status:8, grpc_message:"Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: textembedding-gecko. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas."}"
>

The above exception was the direct cause of the following exception:

ResourceExhausted                         Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     72             return callable_(*args, **kwargs)
     73         except grpc.RpcError as exc:
---> 74             raise exceptions.from_grpc_error(exc) from exc
     75 
     76     return error_remapped_callable

ResourceExhausted: 429 Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: textembedding-gecko. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.

 

 


Good day @Alpharasika,

Welcome to Google Cloud Community!

Based on the error you are encountering, your project has reached the quota limit for online prediction requests per base model (here, textembedding-gecko). To resolve this, you need to request a quota increase for your project: https://cloud.google.com/docs/quota#requesting_higher_quota
Here is a guide on how to request a higher quota limit: https://cloud.google.com/docs/quota_detail/view_manage#requesting_higher_quota
Your request should be processed in about 2 to 3 days, and Cloud Customer Care will send an email once it is approved or denied.
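
While the quota request is pending, one common way to cope with a 429 RESOURCE_EXHAUSTED error is to retry the call with exponential backoff so that requests are spread out over time. The following is only a minimal sketch, not an official Vertex AI recipe: `call_with_backoff` and its parameters are hypothetical names chosen for illustration, and the exception type is passed in so the snippet runs without any Google libraries installed.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(); if it raises retry_on, wait and retry with exponential backoff.

    Hypothetical helper for illustration. retry_on is the exception class
    (or tuple of classes) that signals a quota error.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the quota error to the caller
            # Delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay,
            # with up to 10% random jitter so parallel clients do not retry in sync.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


# Assumed usage with the exception class shown in the traceback above;
# embed_batch is a placeholder for whatever call is hitting the quota:
#
#   from google.api_core.exceptions import ResourceExhausted
#   result = call_with_backoff(lambda: embed_batch(texts), ResourceExhausted)
```

Backoff only helps with short bursts over the per-minute limit. If the workload steadily needs more requests than the quota allows, the quota increase is still required.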

Hope this helps!