I hit the same error using `LangChain` with `gemini-2.0-flash` through the `google_vertexai` provider.
```
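import os
import vertexai
from langchain.chat_models import init_chat_model
from langchain_google_vertexai import VertexAIEmbeddings

vertexai.init(project=os.environ.get("VERTEXAI_PROJECT_ID"), location=os.environ.get("VERTEXAI_PROJECT_LOCATION"))
llm = init_chat_model("gemini-2.0-flash", model_provider="google_vertexai")
embeddings = VertexAIEmbeddings(model="text-embedding-005")
# `graph` is the compiled LangGraph graph; its definition is not shown here.
for message, metadata in graph.stream(
    {"question": "What is Task Decomposition?"}, stream_mode="messages"
):
    print(message.content, end="|")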
```
```
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/google/api_core/grpc_helpers.py", line 170, in error_remapped_callable
    return _StreamingResponseIterator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/google/api_core/grpc_helpers.py", line 92, in __init__
    self._stored_first_result = next(self._wrapped)
                                ^^^^^^^^^^^^^^^^^^^
  File "/home/khteh/.local/lib/python3.12/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/khteh/.local/lib/python3.12/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.RESOURCE_EXHAUSTED
>
```
As seen in the console, `LangChain` does retry with exponential backoff, but the call still fails after 10 seconds!
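For anyone unfamiliar with the retry behaviour mentioned above, here is a minimal, library-independent sketch of exponential backoff. It is not LangChain's actual implementation; `QuotaExhausted` and `call_with_backoff` are hypothetical names used only for illustration, and the delay values are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical stand-in for a RESOURCE_EXHAUSTED (HTTP 429) error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on QuotaExhausted with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExhausted:
            if attempt == max_attempts - 1:
                raise  # retry budget spent: surface the original error
            # Wait base_delay, 2x, 4x, ... capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise AssertionError("unreachable")
```

With a small retry budget the total wait stays short (1 + 2 + 4 = 7 seconds for four attempts with the defaults above), so a quota that resets per minute may still be exhausted when the last attempt fails. In real code the `except` clause would presumably catch `google.api_core.exceptions.ResourceExhausted` instead of the stand-in class.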
https://console.cloud.google.com/iam-admin/quotas? is a mess, a jungle to explore when trying to pinpoint the cause of the issue! The page lists 12,525 "Quotas and System Limits"! After scrolling through the first few pages of the table, I have not seen any quota being exceeded!