gemini-2.5-flash-preview-05-20 supports batch prediction now (2025-05-21), doc is here:
Gemini 2.5 Flash | Generative AI on Vertex AI | Google Cloud
I want to use the gemini-2.5-flash batch API (Python) on Vertex AI, and I want to turn off thinking (thinking_budget=0). How do I turn it off in the config?
The GenerationConfig should be included in the input JSONL or the BigQuery table. See examples here: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/batch-prediction/intro_batch_p...
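A minimal sketch of what that input JSONL could look like, assuming each line wraps a `GenerateContentRequest` in a `request` field and that `generationConfig.thinkingConfig.thinkingBudget` set to 0 disables thinking (the prompts and filename are illustrative):

```python
import json

# Example prompts for the batch job (placeholders).
prompts = ["What is the capital of France?", "Summarize the water cycle."]

lines = []
for prompt in prompts:
    request = {
        "request": {
            "contents": [{"role": "user", "parts": [{"text": prompt}]}],
            "generationConfig": {
                "temperature": 0.2,
                # Assumed field for disabling thinking on Gemini 2.5 Flash.
                "thinkingConfig": {"thinkingBudget": 0},
            },
        }
    }
    lines.append(json.dumps(request))

# One JSON object per line, as batch prediction expects.
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

The same `generationConfig` object would go into a `request` column if you use a BigQuery table as input instead of JSONL.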
Note that Gemini 2.5 Flash does not yet support batch prediction.