I have deployed Sesame CSM 1B on Vertex AI, and I think the output should be audio, but I'm not getting it in that format. Also, I don't know what the input should be.
Hello @king7475,
If you’ve deployed SeamlessM4T (Sesame CSM 1B) on Google Cloud Vertex AI but aren’t receiving audio output, the issue likely stems from:
To resolve this, ensure your input matches the expected format (text for TTS, audio for S2S) and explicitly request speech output in your API call. If Vertex AI still returns text, additional configuration or a vocoder may be needed.
I hope this helped! 🙂
Best regards,
Suwarna
Hi @king7475,
Thank you for the question! Below is an example of how to get a prediction from Sesame CSM deployed on Google Cloud Vertex AI.
import base64

from google.cloud import aiplatform
from IPython.display import Audio, display

# Each instance is one utterance: a speaker ID and the text to speak.
instances = [
    {"speaker": 0, "text": "I just won a million dollar lottery."},
    {"speaker": 1, "text": "You're kidding me!"},
]

# Replace the placeholders with your own project, region, and endpoint ID.
sesame_endpoint = aiplatform.Endpoint(
    "projects/{your-project-id}/locations/{your-endpoint-region}/endpoints/{your-endpoint-id}"
)
response = sesame_endpoint.predict(instances=instances)

# Each prediction carries base64-encoded audio; decode it and play it inline.
for prediction in response.predictions:
    display(Audio(base64.b64decode(prediction["audio"])))
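If you are not working in a notebook, the inline player above will not help, so here is a minimal sketch that writes each decoded prediction to a file instead. It assumes, as in the example above, that every prediction is a dict with a base64-encoded "audio" field. The helper name `save_predictions`, the output directory, and the file naming are my own illustrative choices, not part of the Vertex AI SDK. I am also not certain which container format the endpoint returns, so the sketch only uses the .wav extension when the bytes start with the standard RIFF tag and falls back to .bin otherwise.

```python
import base64
from pathlib import Path


def save_predictions(predictions, out_dir="csm_audio"):
    """Decode each prediction's base64 "audio" field and write it to disk.

    Hypothetical helper: assumes each prediction is a dict with an "audio" key.
    Returns the list of file paths that were written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    saved = []
    for i, prediction in enumerate(predictions):
        audio_bytes = base64.b64decode(prediction["audio"])
        # WAV files begin with the ASCII tag "RIFF"; anything else is kept
        # as a raw .bin file so no format is claimed that was not verified.
        suffix = ".wav" if audio_bytes[:4] == b"RIFF" else ".bin"
        target = out_path / f"utterance_{i}{suffix}"
        target.write_bytes(audio_bytes)
        saved.append(target)
    return saved
```

With the response from the example above, calling `save_predictions(response.predictions)` should leave one file per utterance in the `csm_audio` folder.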
Hi king7475,
Welcome to the Google Cloud Community!
In addition to @SuwarnaKale and @ilnardo92’s input, here are a few approaches to consider:
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.