Hello,
I'm experiencing a persistent, critical failure with the Vertex AI API and I'm hoping to find if others have this issue or if a Googler can investigate. All requests to the generateContent endpoint are failing.
System Details:
Project ID: curious-nucleus-440815-i6
API Endpoint: europe-west2-aiplatform.googleapis.com
Model: gemini-1.5-pro-002
Error Codes: The primary error is RESOURCE_EXHAUSTED (Code 8). This is causing secondary 429 (Too Many Requests) and 504 (Gateway Timeout) errors.
Comprehensive Troubleshooting Steps Performed:
I have worked extensively to debug this and have ruled out all client-side causes:
Quota Verification: Confirmed via the GCP console that all Vertex AI API quotas are at 0% usage. This is not a project-level quota issue.
Request Complexity: The error occurs even on the most basic requests. As a final test, I reduced my API call's tool definition to a single, minimal function. The request still fails with the same error, proving the issue is not related to the complexity of my request.
No Widespread Outage: The Google Cloud Service Health dashboard shows no active incidents for Vertex AI.
Permissions Confirmed: The service account being used is active and has the required "Vertex AI User" role.
Persistent Issue: The issue has been ongoing for several hours, and our client's exponential backoff retry logic is unable to overcome it.
This appears to be a definitive internal, server-side fault specific to the infrastructure handling our project. Any insights or escalation help would be greatly appreciated.
Thank you.