Hi there folks! Hoping to find an answer to a problem that's frustrating me to no end.
I'm using OpenAPI schema definitions to constrain the generated output from Gemini Flash via API calls and Batch Predictions. It's generally working well, but as my schemas get more complex/large I start getting 400 errors with no useful context when running generative prompts.
The responses are 400 errors with 'INVALID_ARGUMENT' and errorDetails=undefined (which is unbelievably useless to debug).
Through a ton of trial and error, it appears there may be some kind of unpublished hard limit on the size or depth of the schema you can supply. I've dug through the docs and can't find anything documenting one. The input token count is still relatively small: I can paste the schema inline into Vertex AI freeform and ask for the exact same output, and it works perfectly at ~4,000 input tokens. The JSON-serialized schema is only about 8k characters long.
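For reference, here's a stripped-down Python sketch of roughly what I'm calling (project, location, model name, prompt, and schema are all placeholders; my real schema is the ~8k-character one described above, and that's the one that 400s). This assumes a recent version of the Vertex AI Python SDK:

```python
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

# Placeholder project/location/model; the real schema is ~8k characters with deeper nesting.
vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

response_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["A", "B", "C"]},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "score": {"type": "number"},
                },
                "required": ["name"],
            },
        },
    },
    "required": ["category", "items"],
}

config = GenerationConfig(
    response_mime_type="application/json",
    response_schema=response_schema,
)

try:
    resp = model.generate_content("Classify the following text ...", generation_config=config)
    print(resp.text)
except Exception as err:
    # With the larger schemas, this is where the bare 400 INVALID_ARGUMENT surfaces,
    # with nothing useful in the error details.
    print(type(err).__name__, err)
```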
A few notes:
Thanks in advance to anybody who can help!
As a follow-up, I'm seeing this with the generative APIs (non-Vertex, i.e. AI Studio) as well! Here's a post from the forum over there from somebody having a similar issue:
https://discuss.ai.google.dev/t/json-mode-internal-server-error-500/38123/2
It's really frustrating when things don't work as expected, especially with nested schemas and enums. I keep running into issues like yours ...
Hi @ecarothers,
According to this documentation, a 400 HTTP error with INVALID_ARGUMENT or FAILED_PRECONDITION occurs when a request fails API validation or tries to access a model that requires allowlisting or is restricted. This is often due to exceeding the model's input token limit. To resolve this, please refer to the Model API reference for Generative AI for details on request parameters and limits.
Here are some potential causes and solutions:
Additional Tips:
I hope the above information is helpful.
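For example, you can confirm the request is under the model's input token limit before sending it. A rough sketch with the Vertex AI Python SDK (project, location, model, and prompt are placeholders):

```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

# Count tokens for the same contents you would pass to generate_content,
# then compare against the model's documented input token limit.
count = model.count_tokens("Classify the following text ...")
print(count.total_tokens, count.total_billable_characters)
```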
Thanks for the reply. I have done everything you suggested, all to no avail. I'm confident that my token counts are drastically under the input limits. Additionally, I have thoroughly tested removing specific parts of the schema, but there's no rhyme or reason to which parts make it fail.
More importantly, if there ARE limits on the schema, they really need to be published! I have no idea whether adding one more ENUM value, or a new top-level or nested key, will break things. This matters even more when the schema is customized across calls (for example, using the API in a product where each customer can define their own set of ENUMs for a schema key). At this point I have zero confidence in using Gemini in production, because there is no concrete documentation from Google on how I can or should expect this to behave.
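To make that customization concern concrete, here's a hypothetical sketch (all names invented) of how a per-customer schema would get assembled before each call:

```python
# Hypothetical illustration of the per-customer schema customization described above.
# Each customer supplies their own enum values, so the schema size and shape vary per call,
# and there is no documented limit to validate against before sending the request.
def build_response_schema(customer_enum_values: list[str]) -> dict:
    return {
        "type": "object",
        "properties": {
            "label": {"type": "string", "enum": customer_enum_values},
            "confidence": {"type": "number"},
            "evidence": {
                "type": "array",
                "items": {"type": "string"},
            },
        },
        "required": ["label", "confidence"],
    }

# One customer might have 5 enum values, another 200. With no published limit,
# there is no way to know in advance which of these requests will 400.
schema_small = build_response_schema(["spam", "not_spam"])
schema_large = build_response_schema([f"category_{i}" for i in range(200)])
```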
Hi @ecarothers,
I completely understand your frustration with the unclear schema limits. Since you've tried various configurations without success, it may be helpful to reach out to Google Cloud Support for specific guidance; they can clarify any potential constraints. Keeping an eye on the release notes for the latest updates and features is also a good idea, and you might find relevant information in this documentation as well. In the meantime, simplifying your schema could be a useful temporary workaround while you gather more insights.
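For example, one way to simplify is to request plain JSON output without attaching the large responseSchema, and validate the result on your side instead. A rough sketch (placeholder names; assumes the Vertex AI Python SDK and the third-party jsonschema package):

```python
import json

import vertexai
from jsonschema import ValidationError, validate  # pip install jsonschema
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

# Keep the full schema locally for validation instead of sending it as responseSchema.
full_schema = {"type": "object", "properties": {"label": {"type": "string"}}, "required": ["label"]}

config = GenerationConfig(response_mime_type="application/json")  # JSON mode only, no schema
prompt = "Classify the text and reply as JSON matching this schema: " + json.dumps(full_schema)

resp = model.generate_content(prompt, generation_config=config)
try:
    data = json.loads(resp.text)
    validate(instance=data, schema=full_schema)
    print(data)
except (json.JSONDecodeError, ValidationError) as err:
    # Retry or handle invalid output here.
    print("Output failed validation:", err)
```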
I hope the above information is helpful.