gemini-2.5-flash-preview-05-20 supports batch predictions now (2025-05-21); the doc is here:
Gemini 2.5 Flash | Generative AI on Vertex AI | Google Cloud
I want to use the gemini-2.5-flash batch API (Python) in Vertex AI, and I want to turn off thinking (thinking_budget=0). How do I turn it off in the config?
The GenerationConfig should go in the input JSONL file or the BigQuery table. See examples here: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/batch-prediction/intro_batch_p...
Note that Gemini 2.5 Flash does not yet support batch prediction.
# Turn off thinking for a single generate_content call
from google import genai
from google.genai.types import GenerateContentConfig, ThinkingConfig

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-05-20",
    contents="What is AI?",
    config=GenerateContentConfig(
        thinking_config=ThinkingConfig(
            thinking_budget=0,  # 0 disables thinking
        )
    ),
)
You can add the config in your batch input.
Please see more details about the thinking config in https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_2...
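To make "add the config in your batch input" concrete, here is a minimal sketch of building one request line for the batch input JSONL with thinking disabled. The field names (`request`, `contents`, `generationConfig`, `thinkingConfig`, `thinkingBudget`) follow the Gemini batch request format as I understand it; verify them against the batch-prediction docs linked above, since the schema may change.

```python
# Sketch: one line of a batch-prediction input JSONL, with the generation
# config (including thinkingConfig) embedded in the request itself.
import json

request_line = {
    "request": {
        "contents": [
            {"role": "user", "parts": [{"text": "What is AI?"}]}
        ],
        # Per-request generation config; thinkingBudget=0 turns off thinking.
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": 0}
        },
    }
}

# Each request becomes one JSON object per line in the JSONL file.
print(json.dumps(request_line))
```

You would write one such line per prompt to the JSONL file that you point `src` at when creating the batch job.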
Vertex Gemini batch API example:
from google import genai
from google.genai.types import CreateBatchJobConfig, HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Replace with your own Cloud Storage destination for the job output.
output_uri = "gs://your-bucket/batch-output/"

# See the documentation: https://googleapis.github.io/python-genai/genai.html#genai.batches.Batches.create
job = client.batches.create(
    model="gemini-2.5-flash-preview-05-20",
    # Source link: https://storage.cloud.google.com/cloud-samples-data/batch/prompt_for_batch_gemini_predict.jsonl
    src="gs://cloud-samples-data/batch/prompt_for_batch_gemini_predict.jsonl",
    config=CreateBatchJobConfig(dest=output_uri),
)
It is using CreateBatchJobConfig — does that accept the same thinking_config as GenerateContentConfig?
No. CreateBatchJobConfig only carries job-level settings such as the output destination; the GenerationConfig (including the thinking config) goes in the input JSONL or the BigQuery table. See examples here: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/batch-prediction/intro_batch_p...
Note that Gemini 2.5 Flash does not yet support batch prediction.