Gemini request: Microphone input that isn't just "one shot"

danielrosehill · 01-07-2025 04:30 PM

This is one of those small UI details that becomes a big impediment to fluid use of Gemini via the web UI.

Currently, you can use speech-to-text in the Gemini web UI by clicking the microphone button in the prompt bar.

It works relatively well, although Google's speech-to-text performance still lags quite a bit behind that of Whisper.

The issue, however, is that it's a one-shot microphone button! Once you've used it to capture your prompt with your voice, there's no way to click the button again.

Once you have used voice-to-text to capture your prompt, the microphone icon turns into the submit button.

There are lots of times when I dictate a prompt, stop the microphone capture because I think I've finished, and then decide that I want to add something more.

I think this workflow is pretty natural for people who use voice to write prompts, and the constraint that the microphone input can only be used once per prompt looks to me like an unintentional quirk of the UI.
