Hi! I'm using the Gemini Python SDK to transcribe a long audio file. My system prompt includes instructions for a JSON output format.
I first upload the audio file using genai.upload_file (I get 429 Resource Exhausted if I simply base64-encode it inline), which returns a file reference I can pass in the message. I then start a new chat conversation and stream the response. If the output hits the token limit, I send another message with the prompt "continue". This *should* resume generating from where it left off, but instead it always starts over from the beginning! The same flow works fine from the cloud console, where I get the full output.
I'm using the google-generativeai Python SDK 0.7.0. What could be going wrong??
Here is my code:
def run(self) -> str:
    audio = self._upload_to_gemini(self._audio_file, mime_type="audio/wav")
    model = genai.GenerativeModel(
        model_name="gemini-1.5-flash",
        generation_config={
            "temperature": 0,
            "max_output_tokens": 8192,
            "response_mime_type": "text/plain",
        },
        system_instruction=self._system_prompt,
    )
    transcription = ""
    chat = model.start_chat(history=[])
    response = chat.send_message(["Transcribe this file", audio], stream=True)
    print("*** response:")
    for chunk in response:
        print(chunk.text, end="")
        transcription += chunk.text
    # Keep asking for more until the JSON output closes with "}" on its own line.
    while transcription.strip().splitlines()[-1] != "}":
        print("*** asking for next segment..")
        print(chat)
        response = chat.send_message("continue", stream=True)
        print("*** response:")
        for chunk in response:
            print(chunk.text, end="")
            transcription += chunk.text
    return transcription
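One thing I noticed while trimming this down: if the model ever streams back only whitespace, the `splitlines()[-1]` check in the `while` condition raises IndexError. A guarded version of that completeness check (the helper name `is_complete` is just my own, not from the SDK) would look like:

```python
def is_complete(transcription: str) -> bool:
    """True once the accumulated output ends with a bare closing brace."""
    lines = transcription.strip().splitlines()
    return bool(lines) and lines[-1] == "}"
```

So the loop would become `while not is_complete(transcription):`, but either way the continuation still restarts from the beginning.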