
504 Failed to create embedding quota checker after 60s. Global embedding quota pool are out of quota

Hi,

I'm experimenting with RAG using the vertexai Python library. I was able to create a corpus and import files a few days ago at first. But for the past few days, all my attempts to import new files have ended with an error message (I'm still able to create / inspect / delete corpora... just not populate them):

DeadlineExceeded: 504 Failed to create embedding quota checker after 60s. Global embedding quota pool are out of quota. 4: Failed to create embedding quota checker after 60s. Global embedding quota pool are out of quota.

I've examined the quotas under IAM & Admin > Quotas, but I don't see any quota that I might be hitting (even the first import attempt of a small file after > 24h of inactivity fails...).

I'm using this piece of code as a reference: https://cloud.google.com/vertex-ai/generative-ai/docs/samples/generativeaionvertexai-rag-import-file... (the corpus was created with publisher_model = 'publishers/google/models/text-multilingual-embedding-002')
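For reference, this is roughly what I'm running, a minimal sketch adapted from that sample (project, location, bucket path, display name, and chunking values are placeholders; the keyword names follow the preview API at the time of writing and may differ in newer releases):

```python
import vertexai
from vertexai.preview import rag

# Placeholders -- replace with your own project / region / bucket.
vertexai.init(project="my-project", location="us-central1")

# Corpus created with the multilingual embedding model mentioned above.
corpus = rag.create_corpus(
    display_name="my-corpus",
    embedding_model_config=rag.EmbeddingModelConfig(
        publisher_model="publishers/google/models/text-multilingual-embedding-002"
    ),
)

# This is the call that now fails with the 504 / quota checker error.
response = rag.import_files(
    corpus.name,
    ["gs://my-bucket/docs/"],
    chunk_size=512,
    chunk_overlap=100,
)
print(response)
```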

Any hints welcome!

4 REPLIES

Same problem here. I tried different regions, creating a new corpus, and different models, but I get the same error: "Global embedding quota pool are out of quota"

Hi @EtienneC,

Welcome to Google Cloud Community!

A possible cause can also be related to network issues that prevent the requested operation from executing, or you might have set a deadline that is shorter than the default processing time, as stated in this documentation. I would recommend increasing the timeout if possible and applying retries with exponential backoff to handle failed requests.
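For example, here is a minimal sketch of that pattern with the vertexai preview RAG API (the timeout= keyword follows the import sample linked in the original post; the initial delay and attempt count are arbitrary values you should tune):

```python
import time

from google.api_core import exceptions
from vertexai.preview import rag

def import_with_backoff(corpus_name, paths, max_attempts=5):
    """Retry rag.import_files on DeadlineExceeded with exponential backoff."""
    delay = 15  # seconds before the first retry; doubles after each attempt
    for attempt in range(1, max_attempts + 1):
        try:
            # timeout=600 raises the client-side deadline well above
            # the 60s mentioned in the error message.
            return rag.import_files(corpus_name, paths, timeout=600)
        except exceptions.DeadlineExceeded as err:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            print(f"Attempt {attempt} failed ({err}); retrying in {delay}s")
            time.sleep(delay)
            delay *= 2
```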

If this does not work, I'd recommend reaching out to Google Cloud Support for further clarification. Otherwise, open an issue report so that our Engineering Team can look into it. Before filing, please take note of what to expect when opening an issue.

Note: Provide detailed information and relevant screenshots to make it easier for them to solve your issue.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

I tested many things, especially whether this might be influenced by the bucket and corpus region locations, but it turned out not to be the case (at least I could not figure out a pattern: mixing EU & US once worked!). At the end of the day, I have some combinations of buckets / corpora that just work (in the sense that I can import files), so on my side I'll consider it solved.

Another slightly annoying thing I noticed is that the import of a relatively large file, e.g. a JSON close to 1 MB, can time out in Python but actually continue to run server-side, while - in my case - it ends up doing nothing after > 1h. Meanwhile no other import into the corpus is possible (an error message indicates that an operation is already ongoing in the corpus)... I can see the corresponding activity in the Vertex AI Dashboard & Cloud Logging but could not find any way to stop the import operation. Again, not blocking; I just make sure to split my documents into small enough pieces.
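In case it helps anyone, here is roughly how I split a large JSON array into smaller files before uploading them to the bucket (a quick sketch; the 200-records-per-file threshold is arbitrary and the file layout is specific to my data):

```python
import json
from pathlib import Path

def split_json_array(src, out_dir, records_per_file=200):
    """Split a large JSON array into smaller files that import reliably."""
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i in range(0, len(records), records_per_file):
        chunk = records[i : i + records_per_file]
        part = out / f"{Path(src).stem}_part{i // records_per_file:03d}.json"
        part.write_text(json.dumps(chunk, ensure_ascii=False), encoding="utf-8")

split_json_array("big_corpus.json", "split_docs")
```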

Same problem here! I tried creating a new corpus and a different embedding model, but I'm still getting the same error: "Failed to create embedding quota checker after 60s. Global embedding quota pool are out of quota"