Re: Text-to-Speech

cyruszei · 03-18-2023 12:28 PM

How come the Text-to-Speech API doesn't have enableWordTimeOffsets: true, but Speech-to-Text does?

I mean, if you use Text-to-Speech, you might want to know exactly which words are being spoken and when. This matters for accessibility support. Would it be possible for the Text-to-Speech API to return a timestamp for each word?

annianni

In this case, try another text-to-speech AI that provides this feature for both directions, not just for speech-to-text.
Moreover, I have personally faced some issues with accuracy and accent with the Google TTS AI, so I am relying on an external AI application for the time being, until they fix their bugs and update the voice modules to more natural-sounding ones.
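
On the original question of per-word timestamps: one possible workaround, as far as I know, is the v1beta1 version of the Text-to-Speech API, which can return timepoints for SSML `<mark>` tags. The sketch below is only an illustration under that assumption. `build_marked_ssml` and `synthesize_with_timepoints` are hypothetical helper names, the `w0`, `w1`, ... mark naming is my own choice, and the API part needs the `google-cloud-texttospeech` package plus valid credentials, so check it against the current docs before relying on it.

```python
from xml.sax.saxutils import escape


def build_marked_ssml(text):
    """Wrap text in <speak> and put a <mark> before each word.

    Returns the SSML string and the list of words, so that mark
    "w3" can be mapped back to words[3] later.
    """
    words = text.split()
    parts = [
        f'<mark name="w{i}"/>{escape(word)}'  # escape &, <, > for XML
        for i, word in enumerate(words)
    ]
    return "<speak>" + " ".join(parts) + "</speak>", words


def synthesize_with_timepoints(text, language_code="en-US"):
    """Hypothetical helper: synthesize audio and return (audio, timings).

    Assumes the v1beta1 client and its enable_time_pointing field;
    imported lazily so the SSML helper above works without the package.
    """
    from google.cloud import texttospeech_v1beta1 as tts

    ssml, words = build_marked_ssml(text)
    client = tts.TextToSpeechClient()
    request = tts.SynthesizeSpeechRequest(
        input=tts.SynthesisInput(ssml=ssml),
        voice=tts.VoiceSelectionParams(language_code=language_code),
        audio_config=tts.AudioConfig(audio_encoding=tts.AudioEncoding.MP3),
        # Ask the service to report when each <mark> is reached.
        enable_time_pointing=[
            tts.SynthesizeSpeechRequest.TimepointType.SSML_MARK
        ],
    )
    response = client.synthesize_speech(request=request)
    # Map each mark name ("w0", "w1", ...) back to its word.
    timings = [
        (words[int(tp.mark_name[1:])], tp.time_seconds)
        for tp in response.timepoints
    ]
    return response.audio_content, timings
```

Calling `build_marked_ssml("Hello big world")` gives SSML with one mark per word. The timings you get back are the start of each mark, so a word's end time would have to be inferred from the next word's start.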