Where to find the RPM for the text embedding API?

Hi,

I can clearly find the rate limits for Gemini Flash 2.0 and Gemini Flash-Lite 2.0, but I can't find anything on the rate limit, such as requests per minute (RPM), for text embedding API calls.

There is a link to request an RPM increase for Gemini Flash 2.0 or Flash-Lite, but no place to request a rate limit increase for embeddings. Is this intentional?

Any help?

Thanks,
Kai

Hi @marketpredict,

Welcome to Google Cloud Community!

Text embedding requests are covered by the quota for the "base_model: textembedding-gecko" model within the Vertex AI API, as outlined in this documentation, so that is the entry to look for. To check its limit, go to the Google Cloud Console, click "IAM & Admin" in the left-hand navigation panel, and select "Quotas & System Limits." Then use the Filter search box to find the requests-per-minute (RPM) quota, as shown in the screenshot below.

[Image: Screenshot 2025-03-13 10.56.00 PM.png]
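
If you would rather check the same quota from a script than from the console, something like the sketch below may work. It assumes the `google-cloud-quotas` Python client is installed and the Cloud Quotas API is enabled for your project, and it assumes the embedding quota is exposed with a `base_model` dimension whose value is `textembedding-gecko` (the name mentioned above). The function names `is_embedding_quota` and `embedding_quotas` are just illustrative helpers, not part of any Google library.

```python
from typing import Iterator, Mapping, Tuple

# Base model named in the reply above; the "base_model" dimension key is an assumption.
BASE_MODEL = "textembedding-gecko"


def is_embedding_quota(dimensions: Mapping[str, str], base_model: str = BASE_MODEL) -> bool:
    """Return True when a quota entry is scoped to the given base model."""
    return dimensions.get("base_model") == base_model


def embedding_quotas(project_id: str) -> Iterator[Tuple[str, dict, int]]:
    """Yield (quota display name, dimensions, limit) for matching Vertex AI quotas."""
    # Imported lazily so the helper above can be used without the client installed.
    from google.cloud import cloudquotas_v1

    client = cloudquotas_v1.CloudQuotasClient()
    parent = (
        f"projects/{project_id}/locations/global"
        "/services/aiplatform.googleapis.com"
    )
    # Each QuotaInfo can carry several per-dimension entries (for example per region).
    for info in client.list_quota_infos(parent=parent):
        for dim_info in info.dimensions_infos:
            dims = dict(dim_info.dimensions)
            if is_embedding_quota(dims):
                yield info.quota_display_name, dims, dim_info.details.value
```

Calling `list(embedding_quotas("your-project-id"))` should return one row per matching entry, if any exist for your project. If it comes back empty, the dimension name or value may differ in your setup, and the console filter described above is the safer way to confirm.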

If there is a need to increase the quota, you can request it from Google Cloud by following the steps in this documentation. Keep in mind that these requests are subject to review and approval and may take some time to process. Additionally, quota increase requests are typically evaluated based on the validity of the business case provided.

You can also refer to the following documentation on the Text embeddings API:

For clarification, you can also contact Google Cloud Support to inquire about your specific needs for rate limit increases for the Text Embedding API.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.