429 Resource Exhausted error when using gemini-2.0-Flash with langchain

khteh · 03-11-2025 12:42 AM

I hit the same error using `LangChain` with `gemini-2.0-flash` through the `google_vertexai` provider.

```

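import os

import vertexai
from langchain.chat_models import init_chat_model
from langchain_google_vertexai import VertexAIEmbeddings

vertexai.init(project=os.environ.get("VERTEXAI_PROJECT_ID"), location=os.environ.get("VERTEXAI_PROJECT_LOCATION"))
llm = init_chat_model("gemini-2.0-flash", model_provider="google_vertexai")
embeddings = VertexAIEmbeddings(model="text-embedding-005")

# `graph` is the compiled LangGraph graph; its definition is not shown in this post.
for message, metadata in graph.stream(
    {"question": "What is Task Decomposition?"}, stream_mode="messages"
):
    print(message.content, end="|")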

```

```

Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..

Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 8.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..

Retrying langchain_google_vertexai.chat_models._completion_with_retry.<locals>._completion_with_retry_inner in 10.0 seconds as it raised ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details..

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/google/api_core/grpc_helpers.py", line 170, in error_remapped_callable
    return _StreamingResponseIterator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/google/api_core/grpc_helpers.py", line 92, in __init__
    self._stored_first_result = next(self._wrapped)
                                ^^^^^^^^^^^^^^^^^^^
  File "/home/khteh/.local/lib/python3.12/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/khteh/.local/lib/python3.12/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.RESOURCE_EXHAUSTED
	details = "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:74.125.68.95:443 {grpc_message:"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.", grpc_status:8, created_time:"2025-03-11T12:35:24.732505078+08:00"}"
>

```

As the console output shows, `LangChain` already retries with exponential backoff (after 4, 8 and 10 seconds), but the call still fails once the 10-second retry is exhausted!
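For anyone who wants to test whether a longer retry window gets past the 429, here is a minimal sketch of capped exponential backoff with jitter wrapped around the call. It is not LangChain's internal retry code; `call_with_backoff` and `ResourceExhaustedError` are hypothetical names, and the attempt count and delays are assumptions chosen for illustration.

```python
import random
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for the 429 error raised by the client library."""


def call_with_backoff(fn, max_attempts=6, base_delay=4.0, max_delay=60.0,
                      sleep=time.sleep, jitter=random.random):
    """Call fn(), retrying on ResourceExhaustedError with capped exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ResourceExhaustedError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Delay doubles on each attempt (4, 8, 16, ...) up to max_delay,
            # plus up to one second of random jitter to avoid synchronized retries.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay) + jitter()
            sleep(delay)
```

With the real client, the `except` clause would presumably catch `google.api_core.exceptions.ResourceExhausted` (the exception named in the log above) instead of the stand-in class. If the error persists even with a window of a minute or more, the cause is more likely a quota or capacity limit than a short burst.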

https://console.cloud.google.com/iam-admin/quotas? is a fxxking mess, an Amazon jungle to explore when trying to pinpoint the cause of the issue! There are 12,525 "Quotas and System Limits" on the page! I have not seen any quota being exceeded after scrolling through the first few pages of the table!

![Image](https://github.com/user-attachments/assets/3c4ad6f6-e973-4d90-800d-ba8661439008)

https://github.com/langchain-ai/langchain/issues/22241

According to https://aistudio.google.com/prompts/new_chat, the Free tier allows 15 RPM and 1,500 requests per day, but the execution above is definitely below that?
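One way to rule out client-side bursts is to throttle requests before they are sent. The following is a minimal sketch, assuming the 15 RPM figure actually applies to the endpoint being called (which is not confirmed here); `SlidingWindowLimiter` is a hypothetical helper, not part of LangChain or the Vertex AI SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls in any window of `period` seconds."""

    def __init__(self, max_calls=15, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls still inside the window

    def _trim(self, now):
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def acquire(self):
        """Block until a call is allowed, then record it."""
        now = self.clock()
        self._trim(now)
        if len(self.calls) >= self.max_calls:
            # Wait until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self._trim(now)
        self.calls.append(now)
```

Calling `limiter.acquire()` before each model request keeps the client under the configured rate. Note that a single streamed question may trigger more than one request (for example an embedding call plus a chat call), so the real request count can be higher than the number of questions asked.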
