
Vertex AI agent response using Gemini-1.5-flash truncated to 512 tokens

I am using the Vertex AI Agent Builder to create a playbook with agents and tools. I am getting good results, but the output is truncated to 512 tokens even when I set the output token limit to 1024 tokens. My input token limit is set to 8K and my prompt input is < 8K.

Please find attached a screenshot of my playbook settings on the left and the truncated output on the right.

* Am I missing any settings?

* I checked the pricing page (https://ai.google.dev/pricing) and don't see any limits specified, even for the free tier, and I seem to be well within 8K for both input and output.

Any insights would be appreciated.

vertexai_agent.png

Solved
1 ACCEPTED SOLUTION

It sounds like you're encountering a token truncation issue, which may stem from a few potential causes despite setting your output limit to 1024 tokens. Here are some factors you can investigate (see the SDK sketch after the list):

1. System-Level Token Cap

2. Tool-Specific Token Management

3. Post-Processing Steps

4. Playbook Agent Setting Overrides
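
The output cap is a per-model generation setting, so each model that a tool calls can carry its own limit. If you can reproduce the call outside the playbook with the Vertex AI Python SDK, a minimal sketch of pinning the cap for one model looks like this (the project ID, location, model version, and prompt below are placeholders, not taken from your setup):

```python
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig

# Placeholder project and location -- substitute your own.
vertexai.init(project="my-project", location="us-central1")

# max_output_tokens is part of the per-model generation config, so every
# model your tools use needs the same value if you expect one consistent
# output limit across the playbook.
model = GenerativeModel("gemini-1.5-flash-001")
response = model.generate_content(
    "Summarize the input document in detail.",  # placeholder prompt
    generation_config=GenerationConfig(
        max_output_tokens=1024,  # raise the cap above the ~512-token default you're seeing
        temperature=0.2,
    ),
)
print(response.text)
```

In the playbook UI the equivalent knob is the output token limit under each agent/tool's model settings, which seems to be why mixed model selections can leave one of them at a lower limit.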


3 REPLIES


Thanks for your response @sahilnaircool. It was helpful, as it made me realize I had selected different gen AI models for my tools, which probably prevented me from setting one output token limit across them. It is now working.
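
For anyone else debugging this: if you can reproduce the call with the Vertex AI Python SDK (outside the playbook), inspecting the finish reason and token counts confirms whether a response was actually cut off by the output cap. A rough sketch with placeholder project, model, and prompt values:

```python
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig

vertexai.init(project="my-project", location="us-central1")  # placeholders

model = GenerativeModel("gemini-1.5-flash-001")
response = model.generate_content(
    "Write a detailed summary of the document.",  # placeholder prompt
    generation_config=GenerationConfig(max_output_tokens=1024),
)

candidate = response.candidates[0]
# A MAX_TOKENS finish reason means the model stopped because it hit the
# output cap, i.e. the truncation is a settings issue rather than the
# model choosing to stop on its own.
print("finish_reason:", candidate.finish_reason)
print("prompt tokens:", response.usage_metadata.prompt_token_count)
print("output tokens:", response.usage_metadata.candidates_token_count)
```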

You're welcome!