Solved: textembedding-gecko Quota

stars · 08-25-2023 07:54 PM

I keep receiving 429 Quota exceeded errors when trying to create embeddings. Looking at my quotas, I can see "online_prediction_requests_per_base_model" is limited to 5/minute.

This seems to contradict this page, which suggests the limit should be 600. https://cloud.google.com/vertex-ai/docs/quotas

Is there a reason why I cannot receive a higher quota?

Many thanks

lsolatorio

Hi @stars,

Welcome and thank you for reaching out to our community.

I understand that our documentation can be confusing at times, so let me help you get a better picture of our quotas and limits.

The base_model:textembedding-gecko does have a quota of 600 requests per minute, but each request is limited to 5 input texts. In other words, you can send at most 600 requests per minute, each containing at most 5 input texts, as shown in the screenshot that you provided.
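To stay within a per-request input limit like the one described above, one option is to split the texts into small batches on the client side before sending them. The following is only a minimal sketch: `chunked`, `embed_all`, and the `embed_batch` callable are hypothetical names, not part of any Vertex AI SDK, and `embed_batch` stands in for whatever function you use to call the embedding model.

```python
from typing import Callable, Iterator, List, Sequence

# Per-request input limit mentioned in this thread (an assumption; check your own quota page).
MAX_TEXTS_PER_REQUEST = 5


def chunked(texts: Sequence[str], size: int = MAX_TEXTS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
) -> List[List[float]]:
    """Embed every text by calling `embed_batch` once per batch.

    `embed_batch` is a hypothetical placeholder for your own wrapper around
    the embedding endpoint; it must return one vector per input text.
    """
    vectors: List[List[float]] = []
    for batch in chunked(texts):
        vectors.extend(embed_batch(batch))
    return vectors
```

With 12 texts, `chunked` produces batches of 5, 5, and 2 texts, so `embed_all` makes three calls to `embed_batch`.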

Please note that you can also reach out to Vertex AI Support to discuss this in more detail.

stars

Thank you for the clarification! 😀

glaforge

I've just discovered that the limit now seems to be 250 input texts per request, up from 5 before.
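To get a feel for what that change means in practice, here is a rough back-of-the-envelope calculation. It assumes the 600 requests per minute figure quoted earlier in this thread still applies; the function names and the 30,000-text workload are purely illustrative.

```python
import math

# Requests-per-minute quota quoted earlier in this thread (an assumption; yours may differ).
REQUESTS_PER_MINUTE = 600


def requests_needed(num_texts: int, texts_per_request: int) -> int:
    """Number of requests required to send `num_texts` in full batches."""
    return math.ceil(num_texts / texts_per_request)


def minutes_needed(num_texts: int, texts_per_request: int,
                   rpm: int = REQUESTS_PER_MINUTE) -> float:
    """Lower bound on wall-clock minutes if the quota is the only bottleneck."""
    return requests_needed(num_texts, texts_per_request) / rpm


# Illustrative workload of 30,000 texts:
print(requests_needed(30_000, 5), minutes_needed(30_000, 5))      # 6000 10.0
print(requests_needed(30_000, 250), minutes_needed(30_000, 250))  # 120 0.2
```

Under these assumptions, the same workload drops from 6,000 requests to 120, so the per-minute request quota becomes much less of a constraint.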