
Structured Output in Vertex AI BatchPredictionJob

The title is mostly self-explanatory, but I will try to describe my problem in a bit more detail.

In my use case I am trying to use batch prediction for an evaluation pipeline, since the output does not need to arrive in real time. Moreover, because my test set is very large, I run into rate limits of the regular API (and into higher costs as well).
Following the documentation, I can only specify the model and the input/output locations, like this:

[Screenshot, 2025-01-19: the batch prediction call from the documentation]
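For reference, a minimal sketch of what that call looks like with the Vertex AI SDK; the project, model name, and Cloud Storage paths are placeholders:

```python
import vertexai
from vertexai.batch_prediction import BatchPredictionJob

vertexai.init(project="my-project", location="us-central1")  # placeholder project/region

# Only the model and the input/output locations can be passed here --
# there is no parameter for generation_config or a response schema.
job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-002",                 # placeholder model
    input_dataset="gs://my-bucket/eval/requests.jsonl",  # JSONL, one request per line
    output_uri_prefix="gs://my-bucket/eval/output/",
)
```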
Passing any additional parameter, such as the generation_config I would use with the regular API (sketched below), throws errors. Function calling does not seem to be possible either, which could have served as a workaround, as it did for previous models. The documentation does not mention any of this, nor can I find it discussed anywhere.
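For comparison, this is how structured output is configured with the regular (online) API; the schema here is a hypothetical stand-in for my actual evaluation schema:

```python
from vertexai.generative_models import GenerativeModel, GenerationConfig

model = GenerativeModel("gemini-1.5-flash-002")  # placeholder model

# Hypothetical evaluation schema: a score plus a short rationale.
response = model.generate_content(
    "Rate the following answer ...",
    generation_config=GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "object",
            "properties": {
                "score": {"type": "integer"},
                "rationale": {"type": "string"},
            },
            "required": ["score", "rationale"],
        },
    ),
)
print(response.text)  # JSON constrained to the schema
```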
I also have to stress that I explicitly do not want to merely validate the output afterwards (which is implemented for redundancy anyway), but to enforce the format in the response generation step itself, so that the evaluation pipeline is configured the same way as the dev/production pipeline.

If this is not currently supported, how can batch predictions be used sensibly for anything beyond a small PoC, given that structured output is the only reliable way to make LLM outputs adhere to a specific format?

And as a side note: with OpenAI's API this is possible.
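There, each line of the batch input JSONL carries a complete request body, so response_format (including a JSON schema) can be set per request. A rough sketch, with placeholder IDs, model, and schema:

```python
import json

# Each batch line is a full /v1/chat/completions request body,
# so structured output can be requested per item.
line = {
    "custom_id": "eval-0001",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "gpt-4o-mini",  # placeholder model
        "messages": [{"role": "user", "content": "Rate the following answer ..."}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "evaluation",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "score": {"type": "integer"},
                        "rationale": {"type": "string"},
                    },
                    "required": ["score", "rationale"],
                    "additionalProperties": False,
                },
            },
        },
    },
}

with open("batch_input.jsonl", "w") as f:
    f.write(json.dumps(line) + "\n")
```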
