
Clarification on how response_schema injects the JSON schema into the prompt

Hi everyone,

I'm currently fine-tuning Gemini for structured output and have hit a snag regarding consistency between fine-tuning and inference. I'd appreciate some insights on the following points:

Background:

  • Fine-Tuning: During fine-tuning, I include the JSON schema directly in the system prompt. For example (pseudo-prompt):
         System: You are a helpful AI assistant that always responds with the following JSON schema:
            {
             "type": "object",
             "properties": { "recipe_name": { "type": "string" } },
             "required": ["recipe_name"]
            }
         User: List a few popular cookie recipes.
         Model: <Output that adheres to the schema>
  • Inference: At inference time, I have the option to use the response_schema parameter. My understanding is that the API then automatically injects the JSON schema into the system prompt and applies additional processing, likely constrained decoding; however, I don't have visibility into the exact format of this injected prompt. (A minimal sketch of both stages follows this list.)
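
For concreteness, here is a minimal sketch of what I mean. It assumes the google-generativeai Python SDK, placeholder API key and model values, and that response_schema accepts a plain dict; the schema and helper function are just for illustration. The idea is to define the schema once and reuse the same text in the tuning prompts and the same dict in the response_schema argument:

    # Minimal sketch of my setup. Assumptions: the google-generativeai Python SDK,
    # a placeholder API key and model name, and that response_schema accepts a
    # plain dict -- substitute your own tuned model resource.
    import json
    import google.generativeai as genai

    RECIPE_SCHEMA = {
        "type": "object",
        "properties": {"recipe_name": {"type": "string"}},
        "required": ["recipe_name"],
    }

    def build_system_prompt(schema: dict) -> str:
        # Fine-tuning side: the schema is embedded verbatim in the system prompt,
        # so every tuning example sees exactly the same schema text.
        return (
            "You are a helpful AI assistant that always responds with the "
            "following JSON schema:\n" + json.dumps(schema, indent=2)
        )

    # One training example in the shape I use for tuning.
    training_example = {
        "system": build_system_prompt(RECIPE_SCHEMA),
        "user": "List a few popular cookie recipes.",
        "model": json.dumps({"recipe_name": "Chocolate chip cookies"}),
    }

    # Inference side: the same dict is passed via response_schema, and the API
    # handles the schema itself (this is the part I can't inspect).
    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel(
        "gemini-1.5-flash",  # placeholder; in practice this is my tuned model
        system_instruction=build_system_prompt(RECIPE_SCHEMA),
    )
    response = model.generate_content(
        "List a few popular cookie recipes.",
        generation_config=genai.GenerationConfig(
            response_mime_type="application/json",
            response_schema=RECIPE_SCHEMA,
        ),
    )
    print(response.text)  # JSON that should match RECIPE_SCHEMA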

My Question: 

How exactly does the response_schema parameter integrate the JSON schema into the prompt during inference? Is there any documentation or method to inspect the exact injected prompt? Given that I need to embed the JSON schema directly during fine-tuning, I want to ensure that my fine-tuning prompts are consistent with the inference-time behavior where the schema is injected automatically. Any best practices or insights to align both stages would be extremely helpful.
