
Resource Exhaustion Errors (429) when using grounding via google_search tool

I am on a paid plan and have been using the grounding feature with gemini-2.0-flash (config["tools"] = [{"google_search": {}}]). It was working fine all day and then I got a solid wall of 429s. Figured I must have hit a quota but I'm not even close on any of them.
Screenshot 2025-02-20 at 8.09.56 PM.png
 
So I looked for any documentation I could find on limits for grounding and found this
Screenshot 2025-02-20 at 8.30.03 PM.png
I'm on a paid plan, so I don't understand why I wouldn't just start incurring charges if I go over 1,500 RPD.

I can confirm that nothing has changed in my code and that non-grounding calls continue to work fine.

Hi @AA-Ron47382892 ,

Welcome to the Google Cloud Community!

The "429 RESOURCE_EXHAUSTED" error you encountered with the gemini-2.0-flash model generally indicates that the number of requests has exceeded the allocated capacity. Even if your overall quota hasn't been reached, you might be surpassing the rate limit for a specific operation. Given that Google Cloud APIs are subject to resource limits, implementing retries with exponential backoff can help handle transient errors when you're close to hitting those limits. For more information about error code 429, you may read this documentation.
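For readers unfamiliar with the retry pattern mentioned above, here is a minimal sketch of exponential backoff with jitter. It is not taken from any Google SDK: `RateLimitError` and `call_with_backoff` are hypothetical names, and `RateLimitError` stands in for whatever exception your client raises on a 429 response.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception a client raises on HTTP 429."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            # The wait ceiling doubles each attempt (1x, 2x, 4x, ... base_delay),
            # bounded by max_delay; the actual wait is a random value below it.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable only so the helper can be exercised without real waiting. Note that backoff only helps with transient limits; it will not clear a 429 caused by an exhausted daily quota.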

Regarding your question about charges after exceeding 1,500 requests per day (RPD): according to the documentation, you will be charged $35 per 1,000 requests. Given your current usage of 1,504 RPD, it seems that you haven't exceeded the threshold for the additional charge.
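As a rough illustration of the pricing quoted above, the sketch below estimates a day's grounding charge. It assumes that requests up to 1,500 per day are not billed and that requests beyond that are billed pro rata at $35 per 1,000; both the pro-rata assumption and the function name `grounding_overage_cost` are mine, so check the pricing page for how billing is actually computed.

```python
from decimal import Decimal

FREE_REQUESTS_PER_DAY = 1500      # threshold quoted in this thread
PRICE_PER_1000 = Decimal("35")    # USD rate quoted in this thread


def grounding_overage_cost(requests_today):
    """Estimated USD charge for one day, ASSUMING pro-rata billing above the threshold."""
    billable = max(0, requests_today - FREE_REQUESTS_PER_DAY)
    return PRICE_PER_1000 * billable / 1000
```

Under these assumptions, a day with 2,500 grounded requests would have 1,000 billable requests.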

If the issue persists, I suggest contacting Google Cloud Support as they can provide more insights to see if the behavior you've encountered is a known issue or specific to your project.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

Thank you for the response.


"Given that Google Cloud APIs are subject to resource limits, implementing retries with exponential backoff can help handle transient errors when you're close to hitting those limits."

Exponential backoff was already in place when I was getting the 429s.


"Even if your overall quota hasn't been reached, you might be surpassing the rate limit for a specific operation"

Can you point me to where I can see the rate limits for the specific operations that could lead to the 429s? I am using the genai SDK and making fewer than 1 call per minute. At that low rate, I'm not sure what rate limits I might be hitting, and nothing is showing up in my console as being exceeded.

Thanks,
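One way to back up a claim like "fewer than 1 call per minute" when talking to support is to log request timestamps on the client and count them over trailing windows. The sketch below is a hypothetical helper (`RequestLog` is not part of any SDK) that only measures what your own process sends; it does not read quota usage from Google Cloud.

```python
import time
from collections import deque


class RequestLog:
    """Records call timestamps and counts how many fall in a trailing window."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the log can be tested
        self._stamps = deque()

    def record(self):
        """Call this immediately before each API request."""
        self._stamps.append(self._clock())

    def count_within(self, seconds):
        """Number of recorded requests in the last `seconds` seconds."""
        cutoff = self._clock() - seconds
        return sum(1 for t in self._stamps if t > cutoff)
```

Calling `count_within(60)` approximates requests per minute and `count_within(86400)` approximates requests per day, which can then be compared against the documented limits.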