textembedding-gecko Quota

I keep receiving 429 Quota exceeded errors when trying to create embeddings. Looking at my quotas, I can see "online_prediction_requests_per_base_model" is limited to 5/minute.

[Screenshot: stars_0-1693018548923.png]

This seems to contradict the quotas page, which suggests the limit should be 600: https://cloud.google.com/vertex-ai/docs/quotas

Is there a reason why I cannot receive a higher quota?

Many thanks

1 ACCEPTED SOLUTION

Hi @stars

Welcome and thank you for reaching out to our community.

I understand that our documentation can be confusing at times, but let me help you get a better picture of our quotas and limits.

The base_model:textembedding-gecko does have a quota of 600 requests per minute, but each request is limited to 5 input texts. This means that you can send a maximum of 600 requests per minute, with a maximum of 5 input texts in each request, as shown in the screenshot that you have provided.

Please note that you can also reach out to Vertex AI Support to discuss this in more detail.
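
To make the arithmetic behind those two limits concrete, here is a minimal sketch in Python. It only uses the figures quoted in this reply (600 requests per minute, 5 input texts per request); the function names are hypothetical and are not part of any Vertex AI library, and the actual limits on your project may differ.

```python
import math

# Figures quoted in the reply above; check your own project's quota page.
REQUESTS_PER_MINUTE = 600
TEXTS_PER_REQUEST = 5


def max_texts_per_minute(rpm: int = REQUESTS_PER_MINUTE,
                         per_request: int = TEXTS_PER_REQUEST) -> int:
    """Upper bound on texts embedded per minute if every request is full."""
    return rpm * per_request


def requests_needed(n_texts: int, per_request: int = TEXTS_PER_REQUEST) -> int:
    """Number of requests required to embed n_texts, rounding up."""
    return math.ceil(n_texts / per_request)


print(max_texts_per_minute())  # 600 * 5 = 3000
print(requests_needed(1200))   # 1200 / 5 = 240
```

Under these assumptions, 1,200 texts would take 240 requests, which fits within a single minute of quota.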

3 REPLIES

Thank you for the clarification! 😀

I've just discovered that the new limit now seems to be 250 input texts per request, compared to 5 before.
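
Since the per-request limit can change like this, one option is to keep it as a parameter when batching. Below is a rough sketch in Python, not an official client: `chunked` and `embed_all` are hypothetical helpers, and `embed_batch` stands in for whatever function actually calls the embedding model in your code. The default of 250 is simply the number mentioned above.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(texts: Sequence[str], max_per_request: int = 250) -> Iterator[List[str]]:
    """Yield consecutive slices of texts, each at most max_per_request long."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(texts), max_per_request):
        yield list(texts[start:start + max_per_request])


def embed_all(texts: Sequence[str],
              embed_batch: Callable[[List[str]], List[List[float]]],
              max_per_request: int = 250) -> List[List[float]]:
    """Call embed_batch once per chunk and concatenate the results in order."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, max_per_request):
        vectors.extend(embed_batch(batch))
    return vectors


# Stand-in for a real embedding call (assumption): one dummy vector per text.
def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    return [[float(len(text))] for text in batch]


vectors = embed_all(["a", "bb", "ccc"], fake_embed_batch, max_per_request=2)
print(len(vectors))  # 3
```

Passing `max_per_request=5` reproduces the older limit, so the same code works on either side of the change.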