429 Resource exhausted on Vertex AI with just one large request

Hi folks, I'm trying to pass a prompt to Gemini (1.5 Pro or 1.5 Flash, same problem). It's a single prompt, not one of several, and it's the first request I've tried to send today.

And I get an error: google.api_core.exceptions.ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/quotas#error-code-429 for more details.

What makes it special, though, is its size:
Prompt Token Count: 605995 # total_tokens
Prompt Character Count: 1294 # total_billable_characters

It's made up as follows:
contents = [
    foo, # Part.from_data
    bar, # Part.from_data
    prompt # Part.from_text
]

I've checked the quotas, and none of them seems to have been exceeded. The input token quota is about 4 million, so I'm well under it.
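
For anyone who wants to sanity-check the same thing, here is a minimal sketch of the arithmetic behind "well under". The two numbers are the ones quoted in this post; treating the quota as a per-minute input token limit is an assumption, since the post does not say which window it applies to, and `quota_usage` is just a hypothetical helper name.

```python
# Figures quoted in the post; the quota value is approximate ("about 4 million").
prompt_tokens = 605_995          # total_tokens reported for the prompt
input_token_quota = 4_000_000    # ASSUMPTION: a per-minute input token limit


def quota_usage(tokens: int, quota: int) -> float:
    """Return the fraction of the quota that a single request would consume."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return tokens / quota


usage = quota_usage(prompt_tokens, input_token_quota)
print(f"One request uses {usage:.1%} of the quota")  # prints "One request uses 15.1% of the quota"
```

So a single request of this size sits at roughly 15% of the stated quota, which is why a quota breach looks unlikely as the cause.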

Do you have any idea what might be causing my problem? Thanks!

Accepted solution:

We received information that the issue is caused by resource problems arising from a new way of handling quotas and requests. It affects version 002 (when many users in the same region try to use 002 at the same time), but 001 still works because it uses the old way of handling quotas and requests.
Other ways to solve it are to try another region (if you are on pay-as-you-go) or to buy dedicated GSUs via Provisioned Throughput.
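
The workarounds above (fall back to 001, or try another region) can be sketched as a simple ordered fallback. This is only a rough illustration, not an official pattern: `first_successful`, `fake_call`, `FakeResourceExhausted` and the candidate list are hypothetical names made up here, and the fake call simulates the 429 so the sketch runs without any cloud access.

```python
from typing import Callable, Iterable, Optional, Tuple, Type


def first_successful(
    call: Callable[[str, str], str],
    candidates: Iterable[Tuple[str, str]],
    retry_on: Type[BaseException],
) -> Tuple[str, str, str]:
    """Try each (model, region) pair in order.

    Returns (model, region, result) for the first pair that succeeds.
    Only `retry_on` moves on to the next candidate; other errors propagate.
    """
    last_error: Optional[BaseException] = None
    for model, region in candidates:
        try:
            return model, region, call(model, region)
        except retry_on as exc:
            last_error = exc  # remember the failure and try the next pair
    if last_error is None:
        raise ValueError("no candidates given")
    raise last_error


# Hypothetical stand-ins so the sketch runs offline.
class FakeResourceExhausted(Exception):
    """Plays the role of the 429 error from the original post."""


def fake_call(model: str, region: str) -> str:
    # ASSUMPTION for the demo: every 002 request hits the 429, 001 goes through.
    if model.endswith("-002"):
        raise FakeResourceExhausted("429 Resource exhausted")
    return f"ok from {model} in {region}"


# Illustrative order: preferred version first, then another region, then 001.
CANDIDATES = [
    ("gemini-1.5-pro-002", "us-central1"),
    ("gemini-1.5-pro-002", "europe-west1"),
    ("gemini-1.5-pro-001", "us-central1"),
]

chosen = first_successful(fake_call, CANDIDATES, FakeResourceExhausted)
print(chosen)
```

With the real SDK, `call` would presumably wrap `vertexai.init(project=..., location=region)` followed by `GenerativeModel(model).generate_content(contents)`, and `retry_on` would be `google.api_core.exceptions.ResourceExhausted`, the exception shown in the original post. That wiring is not tested here.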

We also got the same issue with Gemini 1.5 Flash, with even smaller files.

We are getting the same issue as well.

This has indeed solved my problem, thank you very much!

Is there a better solution, though? In my opinion, downgrading the version isn't ideal, even if it's better than nothing for now.

As answered below, the only other solution at the moment is to buy dedicated GSUs via Provisioned Throughput. Maybe a Googler can provide another solution?

In the meantime, issue solved.

Happy to help out!
Yes, I know it is not ideal to downgrade to 001. There doesn't seem to be any better solution: we only have pay-as-you-go (with the risk of getting Resource exhausted/429) and dedicated GSUs via Provisioned Throughput.

I think it would be great to have something in between, such as the option to buy dedicated GSUs for a specific time frame (1 day, 1 week, x days).
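
If you stay on pay-as-you-go and accept the risk of 429s, the error message's own advice ("Please try again later") can be sketched as a retry with exponential backoff and jitter. This is a generic technique, not something confirmed in this thread as a fix, and it will not help if the shared capacity stays saturated. `call_with_backoff`, `flaky` and `FakeResourceExhausted` are hypothetical names used only for this sketch, and the delays are arbitrary defaults.

```python
import random
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    call: Callable[[], T],
    retry_on: Type[BaseException],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on `retry_on` with exponential backoff and full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt (1s, 2s, 4s, ...) up to max_delay;
            # the actual wait is a random value between 0 and that cap.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")


# Hypothetical stand-ins so the sketch runs offline and without real waiting.
class FakeResourceExhausted(Exception):
    """Plays the role of the 429 error."""


attempts = {"n": 0}


def flaky() -> str:
    # Fails twice with a simulated 429, then succeeds on the third attempt.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise FakeResourceExhausted("429 Resource exhausted")
    return "ok"


delays = []  # record the requested delays instead of actually sleeping
result = call_with_backoff(flaky, FakeResourceExhausted, sleep=delays.append)
print(result, attempts["n"], len(delays))  # prints "ok 3 2"
```

In real use, `call` would wrap the actual request and `retry_on` would be `google.api_core.exceptions.ResourceExhausted`; leave `sleep` at its default so the waits really happen.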