Hello,
I'm trying to use the Text-to-Speech API to generate a multi-speaker audio file. When I use the older WaveNet voices it works just fine, but when I replace the speakers with the newer Neural2 voices I get a 400 error saying:
InvalidArgument: 400 Request contains an invalid argument.
How can I get this to work for multiple speakers using the newer models?
Here is a sample:
<speak>
<voice name="en-US-Neural2-J">
<p>Hello, everyone! Welcome to today's podcast. I'm your host A, and joining me is my co-host, B.</p>
</voice>
<voice name="en-US-Neural2-I">
<p>Hi, A! It's great to be here. Today, we're going to discuss an interesting topic that's been making headlines recently.</p>
</voice>
<voice name="en-US-Neural2-J">
<p>That's right, B. We're talking about the collapse of Silicon Valley Bank, which was triggered by a massive online bank run.</p>
</voice>
<voice name="en-US-Neural2-I">
<p>Indeed, A. This bank run was unlike any other we've seen before, as it was primarily fueled by social media platforms and private chat groups.</p>
</voice>
</speak>
This sample works when I replace the en-US-Neural2 voices with en-US-Wavenet ones.
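For reference, this is roughly how I'm sending the request, using the Python client library (a minimal sketch; the MP3 output and the top-level en-US voice selection are just what I happen to use, not requirements):

from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# ssml holds the multi-speaker document shown above
ssml = """<speak>...</speak>"""

response = client.synthesize_speech(
    input=texttospeech.SynthesisInput(ssml=ssml),
    # Top-level voice selection; the per-speaker voices come from the
    # <voice name="..."> tags inside the SSML itself
    voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
    audio_config=texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    ),
)

with open("podcast.mp3", "wb") as f:
    f.write(response.audio_content)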
Possible workarounds would be converting the text into chunks of under 500 bytes, or splitting the request into smaller pieces (one per speaker turn), as sketched below. This might be a related case.
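To illustrate the second workaround, here is a rough sketch that issues one request per speaker turn, each with a single Neural2 voice and no <voice> tags, and then joins the audio at the end (the turn list is copied from the script above; the naive byte concatenation of MP3 segments is an assumption that most players tolerate, and a proper audio tool would give cleaner joins):

from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# (voice_name, text) pairs taken from the podcast script above
turns = [
    ("en-US-Neural2-J", "Hello, everyone! Welcome to today's podcast..."),
    ("en-US-Neural2-I", "Hi, A! It's great to be here..."),
    ("en-US-Neural2-J", "That's right, B. We're talking about..."),
    ("en-US-Neural2-I", "Indeed, A. This bank run was unlike any other..."),
]

segments = []
for voice_name, text in turns:
    # Each request is small and uses exactly one Neural2 voice
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(ssml=f"<speak><p>{text}</p></speak>"),
        voice=texttospeech.VoiceSelectionParams(
            language_code="en-US", name=voice_name
        ),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    segments.append(response.audio_content)

# Naively concatenate the MP3 segments into one file
with open("podcast.mp3", "wb") as f:
    f.write(b"".join(segments))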