
Gemini Pro and Flash 002 suddenly shorter context window

Hello,

I can no longer use the Vertex AI API for Gemini models with long-context prompts. This is the error I get:

run with [gemini-1.5-pro-002] failed:

Unable to submit request because the input token count is 53163 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models

The same code and the same models used to work, as expected: per the documentation, the Pro and Flash 002 models should support a context window of at least 1M tokens.
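For reference, this is roughly how I check the prompt size with the CountTokens API that the error message suggests, using the Vertex AI Python SDK (a minimal sketch; the project ID, region, and prompt are placeholders, and running it requires valid Google Cloud credentials):

```python
# Sketch: verify prompt token count before submitting a request,
# using the CountTokens API via the Vertex AI Python SDK
# (pip install google-cloud-aiplatform).
import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholder project and region -- substitute your own.
vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-1.5-pro-002")

# count_tokens returns the token count and billable characters
# for the prompt without generating a response.
response = model.count_tokens("my long prompt text ...")
print(response.total_tokens, response.total_billable_characters)
```

In my case the count comes back at around 53K tokens, which is well under the documented 1M-token limit but above the 32768 the API now enforces.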

I wonder if I have been enrolled in a live experiment with a model that only supports a short context window (like the most recent Gemini experimental model, which supports only a 32K context window).
