
Veo 2.0 API Quota Error Despite 0% Usage in Console

Hello Google Cloud Community!

I'm trying to implement a simple video generation interface in Google Colab using Google's Veo 2.0 model via the Vertex AI REST API, but I've hit a puzzling roadblock.

According to my Google Cloud Console, I have a quota of 10 requests per minute for the veo-2.0-generate-001 model in us-central1, and my current usage shows 0%. However, every single API request fails with a 429 error:

 
Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: veo-2.0-generate-001. Please submit a quota increase request.

My implementation:

I'm particularly confused because:

  1. The console shows 0% quota usage
  2. I haven't successfully made a single request yet
  3. I get the quota error immediately on the first attempt

Has anyone else experienced this discrepancy between reported quota usage and actual API behavior with newer generative models like Veo? Is there perhaps a hidden quota limit or some initialization step I'm missing?

Any guidance would be greatly appreciated!

Thanks in advance,
-J


Hi no_notes,

Welcome to the Google Cloud Community!

The 429 RESOURCE_EXHAUSTED error you're encountering with Veo 2.0, even with the console showing 0% quota usage, is a known issue with newer models on Vertex AI. Here are some approaches that might help resolve it:

  • Although your dashboard displays a quota of 10 requests per minute, Veo 2.0 may have undocumented or extremely low initial quotas, potentially even below one request per minute. Double-check the quota page specifically for veo-2.0-generate-001 in us-central1, and look for any related limits, such as concurrent requests, that could be restricting access.
  • Based on the error message, you might need to request a quota increase using this resource, which may help resolve your issue.
  • Double-check that you are sending requests to the correct regional endpoint (us-central1). A simple typo in the URL could cause this error.
  • Make sure that the service account or user account used in your Colab notebook has the necessary IAM roles, such as "Vertex AI User", to access Vertex AI and the Veo 2.0 model. Also confirm that the Vertex AI API is enabled for your Google Cloud project in the APIs & Services section of the console.
  • Implement a retry mechanism with exponential backoff, as this is essential for managing rate limits effectively.
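
As a rough illustration of the endpoint and retry points above, here is a minimal Python sketch. It makes several assumptions: the URL shape (regional host plus a `:predictLongRunning` suffix) should be verified against the current Vertex AI documentation, and `build_veo_endpoint`, `call_with_backoff`, and the `send` callable are hypothetical names introduced for this example, not part of any Google library.

```python
import random
import time


def build_veo_endpoint(project_id, location="us-central1",
                       model_id="veo-2.0-generate-001"):
    """Build the regional REST URL.

    Assumed URL shape; check it against the Vertex AI docs before relying on it.
    The location appears twice (host and path), so a mismatch is an easy typo.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model_id}:predictLongRunning"
    )


def call_with_backoff(send, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical zero-argument callable that performs one request
    and returns a (status_code, body) tuple.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait on each retry, cap it, and add up to 10% jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    # Every attempt was rate limited; return the last 429 to the caller.
    return status, body
```

In a notebook, `send` could wrap something like a `requests.post` call to `build_veo_endpoint("your-project-id")` and return `(response.status_code, response.text)`. Note that backoff only helps with transient rate limiting: if the effective quota for the model is zero, every retry will fail the same way and a quota increase request is still needed.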

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.