Hi there folks! Hoping to find an answer to a problem that's frustrating me to no end.
I'm using OpenAPI schema definitions to constrain the generated output from Gemini Flash via API calls and Batch Predictions. It's generally working well, but as my schemas get more complex/large I start getting 400 errors with no useful context when running generative prompts.
The responses are 400 errors with 'INVALID_ARGUMENT' and errorDetails=undefined (which is unbelievably useless to debug).
Through a ton of trial and error, it appears there may be some kind of unpublished hard limit on the size or depth of the schema you can supply. I've dug through the docs and can't find anything documenting one. The input token count is still relatively small: I can paste the schema inline into Vertex AI freeform and ask for the exact same output, and it works perfectly at ~4,000 input tokens. The JSON-serialized schema is only about 8k characters long.
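For reference, here's a stripped-down Python sketch of roughly what I'm calling (project, location, model name, prompt, and schema are all placeholders; my real schema is the ~8k-character one described above, and that's the one that 400s). This assumes a recent version of the Vertex AI Python SDK:

```python
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

# Placeholder project/location/model; the real schema is ~8k characters with deeper nesting.
vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

response_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["A", "B", "C"]},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "score": {"type": "number"},
                },
                "required": ["name"],
            },
        },
    },
    "required": ["category", "items"],
}

config = GenerationConfig(
    response_mime_type="application/json",
    response_schema=response_schema,
)

try:
    resp = model.generate_content("Classify the following text ...", generation_config=config)
    print(resp.text)
except Exception as err:
    # With the larger schemas, this is where the bare 400 INVALID_ARGUMENT surfaces,
    # with nothing useful in the error details.
    print(type(err).__name__, err)
```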
A few notes:
Thanks in advance to anybody who can help!
As a follow-up, I'm seeing this with the generative APIs (non-Vertex, i.e. AI Studio) as well! Here's a post from the forum over there from somebody having a similar issue:
https://discuss.ai.google.dev/t/json-mode-internal-server-error-500/38123/2
It's really frustrating when things don't work as expected, especially with nested schemas and enums. I keep running into issues like yours ...
Hi @ecarothers,
According to this documentation, a 400 HTTP error with INVALID_ARGUMENT or FAILED_PRECONDITION occurs when a request fails API validation or tries to access a model that requires allowlisting or is restricted. This is often due to exceeding the model's input token limit. To resolve this, please refer to the Model API reference for Generative AI for details on request parameters and limits.
Here are some potential causes and solutions:
Additional Tips:
I hope the above information is helpful.
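For example, you can confirm the request is under the model's input token limit before sending it. A rough sketch with the Vertex AI Python SDK (project, location, model, and prompt are placeholders):

```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

# Count tokens for the same contents you would pass to generate_content,
# then compare against the model's documented input token limit.
count = model.count_tokens("Classify the following text ...")
print(count.total_tokens, count.total_billable_characters)
```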
Thanks for the reply. I have done everything you suggested, all to no avail. I'm confident that my token counts are drastically under the input limits. Additionally, I have thoroughly tested removing specific parts of the schema, but there's no rhyme or reason to which parts make it fail.
More importantly, if there ARE limits on the schema, they really need to be published! I have no idea whether adding one more ENUM value, or a new top-level or nested key, will break things. This matters even more when the schema is customized across calls (for example, using the API in a product where each customer can define their own set of ENUMs for a schema key). At this point I have zero confidence in using Gemini in production, because there is no concrete documentation from Google on how I can or should expect this to behave.
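To make that customization concern concrete, here's a hypothetical sketch (all names invented) of how a per-customer schema would get assembled before each call:

```python
# Hypothetical illustration of the per-customer schema customization described above.
# Each customer supplies their own enum values, so the schema size and shape vary per call,
# and there is no documented limit to validate against before sending the request.
def build_response_schema(customer_enum_values: list[str]) -> dict:
    return {
        "type": "object",
        "properties": {
            "label": {"type": "string", "enum": customer_enum_values},
            "confidence": {"type": "number"},
            "evidence": {
                "type": "array",
                "items": {"type": "string"},
            },
        },
        "required": ["label", "confidence"],
    }

# One customer might have 5 enum values, another 200. With no published limit,
# there is no way to know in advance which of these requests will 400.
schema_small = build_response_schema(["spam", "not_spam"])
schema_large = build_response_schema([f"category_{i}" for i in range(200)])
```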
Hi @ecarothers,
I completely understand your frustration with the unclear schema limits. Since you've tried various configurations without success, it may be helpful to reach out to Google Cloud Support for specific guidance; they can clarify any potential constraints. Keeping an eye on the release notes for the latest updates and features is also a good idea, and you might find relevant information in this documentation as well. In the meantime, simplifying your schema could be a useful temporary workaround while you gather more insights.
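For example, one way to simplify is to request plain JSON output without attaching the large responseSchema, and validate the result on your side instead. A rough sketch (placeholder names; assumes the Vertex AI Python SDK and the third-party jsonschema package):

```python
import json

import vertexai
from jsonschema import ValidationError, validate  # pip install jsonschema
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

# Keep the full schema locally for validation instead of sending it as responseSchema.
full_schema = {"type": "object", "properties": {"label": {"type": "string"}}, "required": ["label"]}

config = GenerationConfig(response_mime_type="application/json")  # JSON mode only, no schema
prompt = "Classify the text and reply as JSON matching this schema: " + json.dumps(full_schema)

resp = model.generate_content(prompt, generation_config=config)
try:
    data = json.loads(resp.text)
    validate(instance=data, schema=full_schema)
    print(data)
except (json.JSONDecodeError, ValidationError) as err:
    # Retry or handle invalid output here.
    print("Output failed validation:", err)
```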
I hope the above information is helpful.