In March, I wrote about how 90% of the voices for Google Text-to-Speech suddenly disappeared. Now, there are only a few Chirp voices available.
A moderator here pointed me to the issue tracker page for it, but since then there have been ZERO updates from Google.
It's been over a month. Does Google not plan to bring back the other voices? This is severely affecting our production, and the complete lack of updates is very frustrating.
Thank you for any help you can provide.
We have not removed any voices.
While they are not visible in the UI, all voices are accessible via the Text-to-Speech API. This was an intentional choice by the product team to only show Chirp 3 HD voices in the Cloud Console UI for Text-to-Speech.
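For anyone who wants to check which voices the API still exposes, here is a minimal Python sketch that calls the `voices.list` REST method. It assumes the `gcloud` CLI is installed and logged in (as it is in Cloud Shell) and that you pass your own project ID; the function names are made up for illustration and are not part of any Google library.

```python
import json
import subprocess
import urllib.request
from typing import Optional

VOICES_URL = "https://texttospeech.googleapis.com/v1/voices"


def voice_names(response: dict, language_code: Optional[str] = None) -> list:
    """Return sorted voice names from a voices.list response.

    If language_code is given (for example "en-US"), keep only voices
    that list that language.
    """
    names = []
    for voice in response.get("voices", []):
        if language_code is None or language_code in voice.get("languageCodes", []):
            names.append(voice["name"])
    return sorted(names)


def fetch_voices(project_id: str) -> dict:
    """Call voices.list with the access token of the logged-in gcloud user.

    Assumption: project_id is a project with the Text-to-Speech API enabled.
    """
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    request = urllib.request.Request(
        VOICES_URL,
        headers={
            "Authorization": f"Bearer {token}",
            # User credentials need a quota project on this API.
            "x-goog-user-project": project_id,
        },
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Calling `voice_names(fetch_voices("your-project-id"), "en-US")` should then give the names of every en-US voice the API returns, not only the ones shown in the console.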
@rmrf Thank you for your answer. May I ask why they chose to do that? We only use the console, not the API, and this is a major problem for us.
Is there any chance that all the voices will return to the Cloud Console UI for Text-to-Speech?
This was a Google leadership decision. I am gathering evidence to inform the product team and leadership about customer sentiment on this change, so the feedback is not going into the void.
If this is a severe issue that you cannot work around, I suggest asking your account team to escalate it to product engineering. After reviewing the situation and discussing it internally, I have concluded that the voices are unlikely to return to the Cloud Console UI for Text-to-Speech. Customers will need to generate speech samples using the officially supported API.
You can get example code generated using Gemini LLM with a prompt like:
Can you create a nodejs commandline app that uses my gcloud credentials to send arg $1 string as text to be converted to speech via Google text to speech?
There is also a suitable example at https://cloud.google.com/text-to-speech/docs/libraries#use
Please let me know if that code does not work, or leave feedback using the thumbs up/down buttons at the top of the page.
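In case it helps to see roughly what such generated code boils down to, here is a hedged Python sketch against the `text:synthesize` REST method, using only the standard library. It assumes `gcloud` is installed and logged in and that you supply your own project ID; the function names and the voice name in the usage note are illustrative, not taken from the official samples. `LINEAR16` is requested because the API returns it with a WAV header, so the result can be saved as a .wav file.

```python
import base64
import json
import subprocess
import urllib.request

SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request_body(text, voice_name, language_code, speaking_rate=1.0):
    """Build the JSON body for text:synthesize (the same choices the console offered)."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "LINEAR16", "speakingRate": speaking_rate},
    }


def synthesize_to_file(body, project_id, out_path):
    """Send the request and write the decoded audio to out_path.

    Assumption: project_id is a project with the Text-to-Speech API enabled.
    """
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    request = urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        audio_b64 = json.load(reply)["audioContent"]
    # The API returns the audio as base64 text; decode it to raw bytes.
    with open(out_path, "wb") as out_file:
        out_file.write(base64.b64decode(audio_b64))
```

A call would look like `synthesize_to_file(build_request_body("Hello there", "en-US-Wavenet-A", "en-US"), "your-project-id", "hello.wav")`, where the voice name is only an example and should be replaced with one the API actually lists for your language.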
@rmrf Thank you for the detailed answer. Would it be possible for you to explain a bit more about the API? For example, where and how do I use it? I understand that I can generate an example code with Gemini LLM (thank you for the prompt), but after I get the code, what do I do with it?
I apologize for asking such an incredibly basic question, but we have absolutely no engineers on our content team, and I don't really know anything about APIs. We always used the Cloud Console UI because we could just input the text, choose the language/voice/speed, and then download the created .wav file.
https://www.googlecloudcommunity.com/gc/AI-ML/Text-To-Speech-Voice-Options-Unavailable/m-p/889232/hi... shows some code to run in the free Cloud Shell (the icon in the upper right of the Cloud Console).
Alternatively, you can use the API Explorer. I made an example: click on the right to change the voice name and the text you want to synthesize.
Click Execute and give it permission.
Then take the response text containing "audioContent" and paste it into the top pane of https://gchq.github.io/CyberChef/#recipe=From_Base64('A-Za-z0-9%2B/%3D',true,false) and click the download icon.
In the "Please enter a filename:" dialog, change the name of the file to be anything ending in .mp3 and it will be saved to your machine.
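If you would rather not paste into a website, the same Base64 decoding step can be done locally. This is a small Python sketch, assuming you have saved the full JSON response from the API Explorer as text; the function names and file names are hypothetical.

```python
import base64
import json


def decode_audio_content(response_text: str) -> bytes:
    """Extract the "audioContent" field from the JSON response and decode it."""
    audio_b64 = json.loads(response_text)["audioContent"]
    return base64.b64decode(audio_b64)


def save_audio(response_text: str, out_path: str) -> None:
    """Write the decoded audio bytes to out_path (for example "speech.mp3")."""
    with open(out_path, "wb") as out_file:
        out_file.write(decode_audio_content(response_text))
```

For example, `save_audio(open("response.json").read(), "speech.mp3")` would write the audio next to the saved response. The file extension should match the audio encoding that was requested.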
@rmrf Thank you again for the detailed explanation. May I confirm one more thing about the API for Google Text-to-Speech?
I apologize if this is a stupid question, but I'm concerned about calculating cost. With the console, I could simply count the number of characters that I input. However, when I use code with the API, should I be considering the number of characters for the entire code, or can I continue to simply count the number of characters only for the actual text that will be turned into audio?
Not a stupid question at all. Thank you for reminding me that not everyone is a developer!
It's just the characters in the text you send to be synthesized, not the code around it.
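As a rough way to keep counting characters the way you did in the console, a sketch like the following may help. It assumes plain-text input and simply counts every character, including spaces and punctuation; the price argument is a placeholder you would fill in from the pricing page for your voice type, and both function names are made up for illustration.

```python
def billable_characters(text: str) -> int:
    """Count the characters in the text sent for synthesis (spaces included)."""
    return len(text)


def estimate_cost(text: str, price_per_million_chars: float) -> float:
    """Estimate cost from a per-million-character price.

    Assumption: price_per_million_chars comes from the current pricing page;
    free-tier allowances are not considered here.
    """
    return billable_characters(text) * price_per_million_chars / 1_000_000
```

Only the string passed as `text` matters here. The length of the script itself has no effect on the count.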
I am going to appeal to product leadership one more time given the ongoing negative experience customers are having with the change.
@rmrf Thank you so much for all your help! Best of luck appealing to them one more time.