
Incomplete / truncated responses from the Gemini Pro model

I'm encountering an issue when running my code in a Jupyter notebook. I've set up the model object and I'm running a list of prompts concurrently, with the list size kept below the Gemini API's 60 QPM limit. However, some responses come back incomplete, particularly when the model outputs long code segments. Instead of limiting those segments to only the necessary changes, the model often emits lengthy code output that hits the 2048-token output limit. The result is unhelpful responses that exhaust the token budget on excessive code. Other language models trim code segments effectively, but I'm stuck with this problem. Any advice or solutions would be greatly appreciated.

Here, the prompt is constructed by simply joining a code file with a prompt requesting ana



I am having the same issue: the API's response stops in the middle for some reason. But when I use the same prompt in Google MakerSuite, I get the full response.

Same here. Changing the maximum output tokens simply truncates the response at a different point instead of making it more or less verbose.

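When the token limit is what cut the response off, the API reports this in the candidate's finish reason, so you can at least tell truncation apart from a genuinely short answer. A minimal sketch, assuming the `google-generativeai` Python SDK (the helper function name is mine):

```python
# Sketch assuming the google-generativeai Python SDK; the helper is hypothetical.
# With a real call you would inspect the first candidate's finish_reason, e.g.:
#   response = model.generate_content(
#       prompt, generation_config={"max_output_tokens": 8192})
#   reason_name = response.candidates[0].finish_reason.name

def is_truncated(finish_reason_name: str) -> bool:
    """Return True when generation stopped because the output token limit was hit."""
    return finish_reason_name == "MAX_TOKENS"
```

A "STOP" finish reason means the model ended on its own; "MAX_TOKENS" means raising the output-token budget (or asking for smaller chunks) is worth trying.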

I noticed the same thing, but it only started happening, seemingly at random, yesterday.

I found what I was doing wrong: I was streaming the response and only parsing the first candidate. I disabled streaming and it started giving complete responses. Hope this helps.


Hey, can you send a snippet of the code showing how you disabled streaming?

Reply with "continue"

It is aware of its limits. I just asked it to provide the output in smaller chunks. "Take the code I initially gave you. Instead of a single response with the output, limit your output to a few sections and functions." After the first output, it responded with, "I'll pause here. Let me know when you're ready for the next part! I'll continue providing the code in smaller sections." then, "Ready for the next batch?". Occasionally, I had to nudge it by typing Continue. It was able to make it through it all without truncating. I think it felt a little smug in the end by responding, "That's it!" Lol.
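The "send it in sections, then nudge with Continue" approach above can be automated. A minimal sketch: the loop below is generic over any `send_message` callable (with the real SDK that could be `model.start_chat().send_message`); the function name, the done marker, and the turn cap are all my assumptions:

```python
# Hypothetical helper: repeatedly ask the model to continue until it signals
# completion (or a turn cap is reached), then stitch the parts together.
# With the google-generativeai SDK, send_message could be bound from a chat
# session, e.g.  chat = model.start_chat(); collect_in_chunks(chat_send, ...)
# where chat_send = lambda msg: chat.send_message(msg).text

def collect_in_chunks(send_message, first_prompt,
                      done_marker="That's it!", max_turns=10):
    """Gather a long answer in pieces by prompting 'Continue' between turns."""
    parts = [send_message(first_prompt)]
    while done_marker not in parts[-1] and len(parts) < max_turns:
        parts.append(send_message("Continue"))
    return "\n".join(parts)
```

The `max_turns` cap matters: if the model never emits the done marker, the loop would otherwise keep burning quota.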