Gemini 1.5 Pro 002 overloaded?

I switched my application from Gemini 1.0 Pro to Gemini 1.5 Pro 002 last week, but today we're getting this response:

"Gemini API operation failed: operation=models/gemini-1.5-pro-002:generateContent, status_code=503
The model is overloaded. Please try again later."

Is this an issue on Google's end? When I test the model in the control panel it seems to work without any problems. Any help or insight would be greatly appreciated.


Same here. I'm currently developing my application, and this issue has been occurring for a few weeks. Over the last few days the problem has gotten worse. Before, about half of the requests seemed to work; today, however, I get this error every time I use the model.

The errors seem less frequent today, so I'm guessing it's on Google's end rather than ours, as nothing has changed from our perspective.

Hi, @Wingee

Yes, it seems like an issue on Google's side. Could you please create a support ticket in your Google Cloud account?


Regards,
Mokit

glen_yu
Google Developer Expert

I've found gemini-1.5-pro-001 to be more stable (or I guess "consistent"?) in terms of performance and handling requests, so I would give that a try.  It's still Gemini 1.5 🙂

Thank you! I’ll give that a go!

Hi @Wingee,

Welcome to Google Cloud Community!

A 503 usually indicates that the service is temporarily overloaded or down on Google's end. Please wait a bit and retry your request. You can also implement an exponential backoff retry mechanism in your application to handle these temporary failures, or use a different model as an alternative. Additionally, monitoring your API usage can help identify patterns or recurring issues. If you're encountering this behavior consistently, you may contact Google Cloud Support to investigate further and check the requests under your project.
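For anyone who wants a starting point, here is a minimal sketch of exponential backoff with jitter in Python. It is not tied to any particular Gemini client library: `ApiError`, `call_with_backoff`, and the `request_fn` callable are hypothetical names used for illustration, and the default delays and attempt count are assumptions you would tune for your own workload.

```python
import random
import time


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status code of a failed call."""

    def __init__(self, status_code, message=""):
        super().__init__(message)
        self.status_code = status_code


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      retry_statuses=(503,), sleep=time.sleep):
    """Call request_fn, retrying retryable failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            # Give up on non-retryable statuses or once attempts are exhausted.
            if err.status_code not in retry_statuses or last_attempt:
                raise
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Jitter: wait a random time up to the ceiling so that many
            # clients do not all retry at the same moment.
            sleep(random.uniform(0, ceiling))
```

You would wrap your own request in a function that raises `ApiError(503, ...)` when the API answers "The model is overloaded", then call `call_with_backoff(that_function)`. The `sleep` parameter is only there so the waiting can be swapped out in tests.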

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

Same here, and this makes my service almost unusable.

I've been getting 503 with "The model is overloaded. Please try again later." and status "UNAVAILABLE" for `gemini-1.5-flash-8b` for days now. That's a shame, as my service relies on it almost fully.

Yo Bois
Can you give an update on this?
I'm currently experiencing a major outage of the Gemini flash model.

At least give me a status page/API where I can see the current load on the models.

Big kisses from a sad developer

Use VertexAI. The migration process might take some work, but it's way more reliable.

Hi! What do you mean by "use VertexAI"? Is there another way to call this model than https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002 ?

Hi. Frankly,

Google has made it quite complicated for developers to use their AI. I asked in their Discord server and learned that Vertex AI is part of Google Cloud Platform. It’s an alternative way to host their AI models, designed specifically for enterprise use cases, which makes it more reliable. As far as I know, the pricing should be the same.
https://cloud.google.com/vertex-ai?hl=en
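
To make the difference concrete, here is a rough sketch of how the two request URLs are built. The first helper follows the endpoint quoted earlier in this thread; the second follows the Vertex AI REST path format as I understand it, so please check it against the current Vertex AI docs. The function names and the `my-project` / `us-central1` values are placeholders for illustration only. Note that authentication also differs: the Gemini API endpoint takes an API key, while Vertex AI expects Google Cloud credentials (an OAuth access token).

```python
def gemini_api_url(model, api_version="v1beta"):
    """Build the generateContent URL for the Gemini API (generativelanguage) endpoint."""
    return (
        "https://generativelanguage.googleapis.com/"
        f"{api_version}/models/{model}:generateContent"
    )


def vertex_ai_url(project_id, location, model):
    """Build the generateContent URL for the same model served through Vertex AI."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model}:generateContent"
    )


# Placeholder project and region; replace with your own values.
print(gemini_api_url("gemini-1.5-pro-002"))
print(vertex_ai_url("my-project", "us-central1", "gemini-1.5-pro-002"))
```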