
AI LLM Gemini in Vertex AI - reached limit of payload size 10 MiB

Good day!
I get an error when I try to save my current prompt, which uses 750K tokens of the 2M-token limit, or when I try to add an image file larger than 10 MB (see the attached screenshot):
ScreenShot_20250127220230.jpeg
I asked my Google Cloud Account Executive about this. She told me to look for the "Generate content requests" quota for the model I use.
There is no specific quota for payload size, only "minutes per project per region".
How can I request an increase of the payload limit so that I can use more media files when working and collaborating with Gemini in Vertex AI?
All the Graces ^_^ !


Hi @ArhmagosBasaroS,

Welcome to Google Cloud Community!

The error you're encountering isn't specifically due to a "payload size" limit, but rather to a restriction on the resources a single request to the Gemini model can use. Large prompts with many tokens and large images demand more processing power, which can exceed the available quotas. Your account executive correctly identified the "Generate content requests" quota as the key factor. There isn't a single "payload size" setting that can be adjusted, so when requesting an increase, avoid emphasizing a payload limit and focus on resource consumption instead. Some steps to consider:

  • Clarify the Quota Limits: Since the Account Executive mentioned the "Generate content requests" quota, ensure you fully understand how this quota is configured for the specific model and project you're using. Confirm whether any related quotas, such as memory or processing power, might also be impacting your ability to handle larger payloads or file sizes.
  • Request Specific Quota Increases: If you're unable to find a specific "payload size" quota, ask Google Cloud Support for a quota increase that addresses larger file uploads and higher token limits. You can frame this request around resource consumption, such as needing more processing capacity or storage to handle larger prompts and media files.
  • Check Current Resource Utilization: Evaluate whether your current usage of resources (memory, processing, etc.) is reaching its limits. Google Cloud's monitoring tools and logs (for example, Cloud Monitoring and Cloud Logging) can help you pinpoint which resource is being maxed out. This will help you make a stronger case for the quota increase.
  • Optimize Payload Size: If increasing quotas is not an immediate option, consider optimizing how you work with the payload. For example, breaking large media files into smaller chunks or reducing the token count per prompt might allow you to continue working within current limits.
  • Justify the Increase: Explain the business value: how does handling larger prompts and images improve efficiency, productivity, or the quality of your work?
  • Consider Regional Quotas: Remember that quotas often apply per region. Specify the region you're working in.
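
As a rough illustration of the "Optimize Payload Size" step, the sketch below groups media files into batches so that each request stays under a size limit. The 10 MiB value comes from the thread title; the function name `plan_batches` and the file names are hypothetical, and the sketch assumes the limit applies to the combined raw size of the files in one request, which may not match how Vertex AI actually measures it.

```python
# Assumption: 10 MiB per request, as named in the thread title.
PAYLOAD_LIMIT_BYTES = 10 * 1024 * 1024


def plan_batches(file_sizes, limit=PAYLOAD_LIMIT_BYTES):
    """Greedily group files into batches whose total size stays within limit.

    file_sizes maps a file name to its size in bytes.
    Returns (batches, oversized): batches is a list of lists of file names,
    oversized lists files that exceed the limit on their own and would need
    to be compressed or split before sending.
    """
    batches, oversized = [], []
    current, current_total = [], 0
    for name, size in file_sizes.items():
        if size > limit:
            # This file cannot fit in any batch by itself.
            oversized.append(name)
            continue
        if current and current_total + size > limit:
            # Adding this file would overflow the batch, so start a new one.
            batches.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += size
    if current:
        batches.append(current)
    return batches, oversized


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical files and sizes, for illustration only.
    sizes = {"a.png": 6 * MIB, "b.png": 5 * MIB, "c.png": 3 * MIB, "d.mp4": 12 * MIB}
    batches, oversized = plan_batches(sizes)
    print("batches:", batches)
    print("oversized:", oversized)
```

Each batch could then be sent as a separate request, while anything reported as oversized would need to be reduced first, for example by lowering the image resolution.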

Once you've confirmed that all the necessary steps for the increase have been completed, you can reach out to Google support and work with your account executive to submit the official quota increase request.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.