RESOURCE_EXHAUSTED error occurred (text bison batch prediction)

I triggered a batch prediction job for the text-bison@001 model with 22,500 prompts per job. The job spends about 1 hour in the queue and 3 hours running, but in the final output the content.prediction field contains the error "RESOURCE_EXHAUSTED error occurred". What could be the possible reason behind this? Do I need to submit a quota increase for this?

3 REPLIES

New update: I cut down the chunk size and submitted multiple batch prediction jobs with a lower number of prompts each. To do so, I had to raise the quota "Concurrent large language model batch prediction jobs running on text-bison model per region" from the default of 4 to 10.
Still, the resource exhaustion itself is a mystery to me, since text-bison batch predictions offer no option to customize the machine configuration to avoid OOM issues.

Any guidelines or tips for handling RESOURCE_EXHAUSTED issues in batch prediction with an LLM such as text-bison are much appreciated.
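The splitting described in the update above can be sketched as plain Python: divide the prompt list into chunks and write each chunk as a JSONL input file (one `{"prompt": ...}` object per line, the input format text-bison batch prediction expects), so each file can be submitted as a separate, smaller job. The helper names `chunk_prompts` and `write_jsonl_chunks` are my own, not part of any SDK.

```python
import json

def chunk_prompts(prompts, chunk_size):
    """Split a list of prompts into fixed-size chunks, one per batch job."""
    return [prompts[i:i + chunk_size] for i in range(0, len(prompts), chunk_size)]

def write_jsonl_chunks(prompts, chunk_size, prefix="batch_input"):
    """Write each chunk as a JSONL file with one {"prompt": "..."} per line.

    Each resulting file can then be uploaded and passed to a separate
    batch prediction job, keeping individual jobs small.
    """
    paths = []
    for n, chunk in enumerate(chunk_prompts(prompts, chunk_size)):
        path = f"{prefix}_{n:03d}.jsonl"
        with open(path, "w") as f:
            for p in chunk:
                f.write(json.dumps({"prompt": p}) + "\n")
        paths.append(path)
    return paths
```

For example, 22,500 prompts split at 2,500 per chunk yields 9 input files — which is why the concurrent-jobs quota had to be raised above the default of 4.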

The error you're encountering, "RESOURCE_EXHAUSTED", typically indicates that the resources allocated for the job are insufficient to complete the task. In the context of Google Cloud Platform (GCP), where AI Platform Batch Prediction jobs are often executed, this error commonly arises due to limitations on resources such as memory or CPU.

Here are some potential reasons for this error:

  • Resource-intensive workload: Your batch prediction job might be using more memory or CPU than is allocated to it. This can happen if the model is particularly resource-intensive or if the dataset you're processing is larger than expected.
  • Quota limits: Your GCP project might have quota limits on the resources available to batch prediction jobs. If your job exceeds these limits, you'll need to request a quota increase from the Google Cloud Console.
  • Data processing inefficiencies: The data preprocessing or post-processing steps might be inefficient, leading to increased resource usage. Reviewing and optimizing these steps could help alleviate the issue.

To address this issue, you can take the following steps:

  • Check the logs and metrics of your batch prediction job to identify which specific resource is being exhausted.
  • Review and optimize the configuration of your model and job parameters.
  • If necessary, request a quota increase for the relevant resources from Google Cloud Console.
  • Consider optimizing your data processing pipeline to reduce resource usage.
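As a concrete version of the first and last steps above, you can scan the batch prediction output and separate successful predictions from ones that hit RESOURCE_EXHAUSTED, then resubmit only the failed prompts in a smaller job. This is a sketch, not SDK code: the record shape (an `instance` field plus a `predictions` list, with the error string appearing in the prediction content, as in the original post) is an assumption about the output JSONL, and `split_results` is a hypothetical helper.

```python
import json

def split_results(output_lines, error_marker="RESOURCE_EXHAUSTED"):
    """Separate successful prediction records from failed instances.

    Assumes each JSONL line holds {"instance": ..., "predictions": [...]}
    and that a failed prediction carries the error marker in its content.
    Returns (ok_records, failed_instances); the failed instances can be
    written back to a JSONL file and resubmitted as a smaller job.
    """
    ok, failed = [], []
    for line in output_lines:
        record = json.loads(line)
        preds = record.get("predictions") or []
        if any(error_marker in str(p) for p in preds):
            failed.append(record["instance"])
        else:
            ok.append(record)
    return ok, failed
```

Retrying only the failed subset avoids re-running the prompts that already succeeded and keeps the retry job well under the original size.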

Hello Poala, 

Thanks for your informative response. This workaround only works for custom-trained or third-party models, where we can customize the machine configuration to use high-memory nodes. However, I am using Google's foundation model text-bison through the batch prediction wrapper provided by the Vertex AI SDK.

Now coming to quotas: the only quota available on GCP for text-bison is "Concurrent Large Language Model batch prediction jobs", which doesn't solve the RESOURCE_EXHAUSTED error.

Regarding post-processing, I have isolated the steps so that I am not performing exhaustive work in a single component; I simply invoke the batch prediction job and nothing else.