
Integrate Gemini 2.0 Live API with Phone Provider (Vonage, Twilio, etc.)

Hey All,

I've been experimenting with connecting the Gemini 2.0 Live API to a phone line, and I'm somewhat surprised by the output audio format Google chose (raw 16-bit PCM at 24 kHz, little-endian). Twilio only supports 8-bit mu-law and Vonage only supports 16-bit PCM at 16 kHz, so both require conversion and resampling. I've gotten things working, but let's just say it's not ready for production. The Vonage version resamples with the ffmpeg CLI tool, which is spotty and slow for real-time conversion, and the Twilio version required a manual PCM -> mu-law conversion.
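For anyone trying the same thing, here is a rough sketch of what the manual conversion could look like in pure Python, without shelling out to ffmpeg. It is not the code from my setup. The function names are made up for illustration, it assumes Twilio wants 8 kHz mu-law, and the resampling is deliberately crude (a 3-sample average for 24 kHz -> 8 kHz, and simple interpolation with no low-pass filter for 24 kHz -> 16 kHz), so expect some aliasing compared to a proper resampler.

```python
import struct

BIAS = 0x84   # standard G.711 mu-law bias (132)
CLIP = 32635  # largest magnitude that still fits after adding the bias


def linear_to_mulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one G.711 mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7 down to 0.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def _unpack_pcm16le(pcm: bytes) -> tuple:
    """Unpack little-endian 16-bit PCM bytes into a tuple of ints."""
    n = len(pcm) // 2
    return struct.unpack("<%dh" % n, pcm[: n * 2])


def pcm24k_to_mulaw8k(pcm: bytes) -> bytes:
    """24 kHz PCM16 -> 8 kHz mu-law (assumed Twilio format).

    Averages each group of 3 samples as a crude low-pass + decimate step.
    Trailing samples that do not fill a group of 3 are dropped, so a
    streaming caller should carry them over to the next chunk.
    """
    s = _unpack_pcm16le(pcm)
    out = bytearray()
    for i in range(0, len(s) - 2, 3):
        avg = int((s[i] + s[i + 1] + s[i + 2]) / 3)
        out.append(linear_to_mulaw(avg))
    return bytes(out)


def pcm24k_to_pcm16k(pcm: bytes) -> bytes:
    """24 kHz PCM16 -> 16 kHz PCM16 (the Vonage format described above).

    Every 3 input samples (a, b, c) become 2 output samples: a, and the
    midpoint of b and c (linear interpolation at position 1.5).
    No anti-aliasing filter is applied.
    """
    s = _unpack_pcm16le(pcm)
    out = []
    for i in range(0, len(s) - 2, 3):
        out.append(s[i])
        out.append((s[i + 1] + s[i + 2]) // 2)
    return struct.pack("<%dh" % len(out), *out)
```

With this, 10 ms of 24 kHz audio (240 samples, 480 bytes) becomes 80 mu-law bytes for Twilio or 320 bytes of 16 kHz PCM for Vonage. For Twilio the mu-law bytes would still need to be base64-encoded before going into the media message payload.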

Has anybody else gotten this working nicely? I've found this demo using a service called Daily, which sets up a WebRTC room and has Twilio connect to it via SIP: https://github.com/kkacquah/gemini-multimodal-example/blob/main/bot_runner.py

I have this working with OpenAI Realtime API + Twilio since OpenAI worked with Twilio on the launch and made sure there was compatibility. 
