
429 Resource Exhausted error when using gemini-2.0-Flash with langchain

I hit the same error using `LangChain` with `gemini-2.0-flash` through the `google_vertexai` provider.
```
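import os

import vertexai
from langchain.chat_models import init_chat_model
from langchain_google_vertexai import VertexAIEmbeddings

vertexai.init(project=os.environ.get("VERTEXAI_PROJECT_ID"), location=os.environ.get("VERTEXAI_PROJECT_LOCATION"))
llm = init_chat_model("gemini-2.0-flash", model_provider="google_vertexai")
embeddings = VertexAIEmbeddings(model="text-embedding-005")

# `graph` is the compiled LangGraph graph; its definition is not shown here.
for message, metadata in graph.stream(
    {"question": "What is Task Decomposition?"}, stream_mode="messages"
):
    print(message.content, end="|")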
```
```
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 8.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..
Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 10.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/google/api_core/grpc_helpers.py", line 170, in error_remapped_callable
    return _StreamingResponseIterator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/google/api_core/grpc_helpers.py", line 92, in __init__
    self._stored_first_result = next(self._wrapped)
                                ^^^^^^^^^^^^^^^^^^^
  File "/home/khteh/.local/lib/python3.12/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/khteh/.local/lib/python3.12/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details."
debug_error_string = "UNKNOWN:Error received from peer ipv4:74.125.68.95:443 {grpc_message:"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.", grpc_status:8, created_time:"2025-03-11T12:35:24.732505078+08:00"}"
>
```
As seen in the console, `LangChain` already retries with backoff (waits of 4, 4, 4, 8 and 10 seconds), but the call still fails after the last 10-second wait!
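
For anyone who wants to try longer waits than the built-in retry uses, here is a minimal sketch of a hand-rolled exponential backoff wrapper. It is not LangChain's own retry logic; the function name `retry_with_backoff` and all of its parameters are made up for illustration, and whether longer waits actually clear this particular 429 is untested.

```python
import random
import time


def retry_with_backoff(fn, *, retry_on=(Exception,), max_attempts=6,
                       base_delay=4.0, max_delay=120.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt (capped), then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With Vertex AI one would presumably pass `retry_on=(google.api_core.exceptions.ResourceExhausted,)` and wrap the whole call, for example `retry_with_backoff(lambda: llm.invoke("..."), retry_on=...)`. For a streamed call the wrapped function has to consume the stream itself, because the error in the traceback above is raised while the stream is being read.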
 
https://console.cloud.google.com/iam-admin/quotas? is a mess, an Amazon jungle to explore when trying to pinpoint the cause of the issue! The page lists 12,525 "Quotas and System Limits"! I have not seen any quota being exceeded after scrolling through the first few pages of the table!
 
 
 
According to https://aistudio.google.com/prompts/new_chat, the free tier allows 15 requests per minute (RPM) and 1,500 requests per day, but the run above is surely well below that?
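
One way to check that assumption, and to stay under a per-minute limit on the client side, is a small sliding-window limiter. This is only a sketch using the standard library; the class name `SlidingWindowLimiter` and its parameters are hypothetical, and the defaults simply mirror the 15 RPM figure quoted above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls=15, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.stamps = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until a call is allowed; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self.clock()
            # Forget calls that have already left the window.
            while self.stamps and now - self.stamps[0] >= self.period:
                self.stamps.popleft()
            if len(self.stamps) < self.max_calls:
                self.stamps.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            pause = self.period - (now - self.stamps[0])
            self.sleep(pause)
            waited += pause
```

Calling `limiter.acquire()` right before every model or embedding request keeps the client under the limit, and `len(limiter.stamps)` shows how many requests were really sent in the last minute. That count may be higher than expected if a single graph run issues more than one request.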