google.api_core.exceptions.ResourceExhausted: 429 Resource exhausted

Hi Community,

I am blocked trying to get an answer from gemini-1.5-flash or gemini-1.5-pro.

I am making a simple call, but I keep getting intermittent errors (sometimes it works, most of the time it doesn't).

The error is "google.api_core.exceptions.ResourceExhausted: 429 Resource exhausted. Please try again later."

I tried:
- checking my quotas & limits from the console and everything is in the green with less than 20% usage
- changing region
- checking my IAM policy
- checking my API service details:
    Method                                                              Requests  Errors
    google.cloud.aiplatform.ui.JobService.ListDataLabelingJobs          9         100%
    google.cloud.aiplatform.v1.PredictionService.GenerateContent        144       24.31%
    google.cloud.aiplatform.v1.PredictionService.StreamGenerateContent  250% (requests and errors fused in the original: 2 and 50%, or 25 and 0%)
    google.cloud.aiplatform.v1beta1.GenAiCacheService.GetCachedContent  4         100%


The code is very simple:

import time
from typing import Optional

import vertexai
from vertexai.generative_models import GenerativeModel, Part

# config, generation_config, safety_settings, log and extract_filename
# are defined elsewhere in the project.


def fetch_and_save_raw_output(
    prompt: str, uri: str, system_prompt: str
) -> Optional[str]:
    vertexai.init(project=config["project_id"], location=config["location"])
    # Pass the function's system_prompt argument as the system instruction.
    model = GenerativeModel(config["model_name"], system_instruction=[system_prompt])

    document = Part.from_uri(mime_type="application/pdf", uri=uri)
    try:
        response = model.generate_content(
            [prompt, document],
            generation_config=generation_config,
            safety_settings=safety_settings,
            stream=False,
        )

        # finish_reason lives on each candidate and is an enum, so compare by name.
        if response.candidates:
            finish_reason = response.candidates[0].finish_reason.name
            if finish_reason == "MAX_TOKENS":
                log.warning(f"Response truncated due to MAX_TOKENS for {uri}")
            elif finish_reason != "STOP":
                log.warning(f"Unexpected finish reason: {finish_reason} for {uri}")

        # Build the raw response with metadata (only printed here)
        pdf_name = extract_filename(uri)
        output_dict = {
            "metadata": {"uri": uri, "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")},
            "response": response.to_dict(),
        }
        print(f"output_dict: {output_dict}")
        return pdf_name
    except Exception as e:
        # This also catches google.api_core.exceptions.ResourceExhausted (429).
        log.exception(f"Error fetching and saving raw output for {uri}: {str(e)}")
        return None

2 REPLIES

Hi @_Gerald_,

Welcome to Google Cloud Community!

The error 'google.api_core.exceptions.ResourceExhausted: 429 Resource exhausted' that you encountered suggests you have hit a quota or resource limit for your project.

Here are some potential ways to address your issue:

  • Review Quotas and Limits: Since you mentioned that your quotas and limits are in the green zone with less than 20% usage, I recommend double-checking your specific quotas for the services you are utilizing, such as PredictionService and JobService. Some quotas might be exhausted more rapidly than others.
  • Retry strategy: Incorporate recovery logic into your code to handle transient errors by using exponential backoff to delay and retry the request.
  • API Key and Project Setup: Make sure that your API key and project configuration are properly set up. Misconfigurations can sometimes result in resource exhaustion errors.
  • Monitor API rate limits: Some APIs have restrictions on the number of requests you can make within a specific time frame. Ensure you don't hit these limits.
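
The retry strategy suggested above can be sketched as follows. This is a minimal illustration, not an official client feature: retry_with_backoff and its parameters (max_attempts, base_delay, max_delay, sleep) are hypothetical names chosen for this example, and the delays are assumptions you would tune for your own workload. It uses exponential backoff with full jitter, so each wait is a random time between 0 and a cap that doubles on every attempt.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a retryable error, wait with exponential backoff and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Re-raise non-retryable errors, and the last error once attempts run out.
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            # The cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random time in [0, cap] so clients do not retry in lockstep.
            sleep(random.uniform(0, cap))
    # Only reached when the loop body never ran.
    raise ValueError("max_attempts must be at least 1")
```

With the code from the question, call would wrap model.generate_content(...) and is_retryable could be lambda e: isinstance(e, ResourceExhausted), with ResourceExhausted imported from google.api_core.exceptions. Note that the broad "except Exception" in the original function would have to let the 429 propagate (or the retry would have to sit inside it) for this to take effect.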

I hope the above information is helpful.

Hi,
You can also try Gemini 1.5 Flash version 001 (gemini-1.5-flash-001), which worked for us.
Another solution is to buy dedicated GSUs (generative AI scale units) with Provisioned Throughput (PT), or to test pay-as-you-go in another region. For a production environment, though, the recommendation we received is to use PT.
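
If you want to try the "other version / other region" idea without editing your config by hand each time, a rough sketch could look like the one below. Everything here is an assumption for illustration: first_available, EXAMPLE_TARGETS and the two regions listed are hypothetical choices, not something taken from the original code, and you should check which regions actually serve the model version you need.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")

# Illustrative (model_name, location) pairs; adjust to what your project can use.
EXAMPLE_TARGETS = [
    ("gemini-1.5-flash-001", "us-central1"),
    ("gemini-1.5-flash-001", "europe-west4"),
]


def first_available(
    targets: Iterable[Tuple[str, str]],
    call: Callable[[str, str], T],
    is_quota_error: Callable[[Exception], bool],
) -> Tuple[Tuple[str, str], T]:
    """Try each (model_name, location) in order and return the first success."""
    last_error: Optional[Exception] = None
    for model_name, location in targets:
        try:
            return (model_name, location), call(model_name, location)
        except Exception as exc:
            # Anything other than a quota error is a real failure: re-raise it.
            if not is_quota_error(exc):
                raise
            # Quota error: remember it and move on to the next target.
            last_error = exc
    if last_error is None:
        raise ValueError("targets must not be empty")
    # Every target reported a quota error; surface the last one.
    raise last_error
```

Here call(model_name, location) would run vertexai.init(project=..., location=location) and then GenerativeModel(model_name).generate_content(...), and is_quota_error would check for google.api_core.exceptions.ResourceExhausted. This only spreads pay-as-you-go traffic around; it does not replace PT for production.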