Veo 2.0 API Quota Error Despite 0% Usage in Console

no_notes · 04-14-2025 02:17 AM

Hello Google Cloud Community!

I'm trying to implement a simple video generation interface in Google Colab using Google's Veo 2.0 model via the Vertex AI REST API, but I've hit a puzzling roadblock.

According to my Google Cloud Console, I have a quota of 10 requests per minute for the veo-2.0-generate-001 model in us-central1, and my current usage shows 0%. However, every single API request fails with a 429 error:

Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: veo-2.0-generate-001. Please submit a quota increase request.
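To check that the quota row shown in the console is the same one the API is complaining about, it may help to pull the metric name and base model out of the error text. The snippet below is only a rough sketch: the pattern is derived from the single message quoted above, and `parse_quota_error` is a hypothetical helper name, so other quota errors may be worded differently and not match.

```python
import re
from typing import Optional

# Pattern derived only from the one error message quoted in this post.
# Assumption: other quota errors may use different wording and will return None.
QUOTA_ERROR = re.compile(
    r"Quota exceeded for (?P<metric>\S+) with base model: (?P<base_model>\S+?)\.(?:\s|$)"
)


def parse_quota_error(message: str) -> Optional[dict]:
    """Return the quota metric and base model named in a 429 message, if any."""
    match = QUOTA_ERROR.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    text = (
        "Quota exceeded for aiplatform.googleapis.com/"
        "online_prediction_requests_per_base_model with base model: "
        "veo-2.0-generate-001. Please submit a quota increase request."
    )
    print(parse_quota_error(text))
```

The extracted metric and base model can then be compared by hand against the quota entry whose usage shows 0%.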

My implementation:

Uses google.colab.auth for authentication
Fetches an access token via gcloud
Creates a properly formatted request with a base64-encoded image and prompt
Sends the request to https://us-central1-aiplatform.googleapis.com/v1/projects/[PROJECT_ID]/locations/us-central1/publish...
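For reference, the steps above look roughly like the sketch below. It is a simplified outline, not the exact notebook code. The request body fields (`instances`, `prompt`, `image.bytesBase64Encoded`, `image.mimeType`) are my understanding of the expected format and should be treated as assumptions, the helper names are hypothetical, and the endpoint URL is passed in as a parameter rather than spelled out here.

```python
import base64
import json
import subprocess
import urllib.error
import urllib.request


def build_payload(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build the JSON body. Field names are assumed, not confirmed by this post."""
    return {
        "instances": [
            {
                "prompt": prompt,
                "image": {
                    "bytesBase64Encoded": base64.b64encode(image_bytes).decode("ascii"),
                    "mimeType": mime_type,
                },
            }
        ]
    }


def get_access_token() -> str:
    """Fetch an access token via the gcloud CLI (must be installed and logged in)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def send_request(endpoint_url: str, payload: dict, token: str) -> tuple:
    """POST the payload and return (status_code, response_text), including for errors."""
    request = urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # A 429 lands here; the body carries the quota message.
        return err.code, err.read().decode("utf-8")


# In Colab, authentication runs first:
#   from google.colab import auth
#   auth.authenticate_user()
# Then: status, body = send_request(ENDPOINT_URL, build_payload(...), get_access_token())
```

Returning the status code and body for error responses as well makes it easy to print the full 429 payload instead of just an exception message.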

I'm particularly confused because:

The console shows 0% quota usage
I haven't successfully made a single request yet
I get the quota error immediately on the first attempt

Has anyone else experienced this discrepancy between reported quota usage and actual API behavior with newer generative models like Veo? Is there perhaps a hidden quota limit or some initialization step I'm missing?

Any guidance would be greatly appreciated!

Thanks in advance,
-J