Re: Issue with Google Speech-to-Text API

fgmustang · 10-15-2024 05:59 PM

I am working with a Python script based on the code below, which is provided in the Google documentation.
Transcribe audio from streaming input | Cloud Speech-to-Text Documentation | Google Cloud
When I add my Google Cloud credentials and use en-US as my language_code, it works perfectly. I have used it for over an hour straight without any issues. When I change the language_code to any of es-MX, es-US, or es-VE, I run into the same issue each time: somewhere between 15 seconds and 10 minutes in, it fails. It usually fails at the 4-minute mark, because that is what the streaming_limit is set to. It displays the Spanish transcription up until the time it fails. I have found that if I speak only short sentences and take a break of a couple of seconds between them, it won't fail. It's almost as if, at the 4-minute mark, it crashes when it can't process the final few sentences in time. It's also odd that it doesn't do this for English; I have tested this extensively over the past 2 weeks with no issues in English.

Below is what Google Cloud shows online.

Name	Requests	Errors (%)	Latency, median (ms)	Latency, 95% (ms)
Cloud Speech-to-Text API	21	71	330,382	516,222
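To make the dashboard figures easier to read, here is a small sketch that converts them into more familiar units. It assumes the Errors (%) column is the share of the 21 requests that failed; the function name `summarize_metrics` is made up for illustration and is not part of any Google library.

```python
def summarize_metrics(requests: int, error_pct: float,
                      median_ms: int, p95_ms: int) -> dict:
    """Convert dashboard figures into failed-request counts and minutes.

    Assumption: error_pct is the percentage of `requests` that ended in error.
    """
    return {
        "failed_requests": round(requests * error_pct / 100),
        "median_minutes": round(median_ms / 60_000, 1),
        "p95_minutes": round(p95_ms / 60_000, 1),
    }


if __name__ == "__main__":
    # Values copied from the dashboard row above.
    print(summarize_metrics(21, 71, 330_382, 516_222))
```

Under that assumption, roughly 15 of the 21 requests failed, and the median request lasted about 5.5 minutes.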

I usually do not get an error message if it stops working during a new stream request. Below is what the output looks like, even though Spanish is still being spoken.

240000: NEW REQUEST

480000: NEW REQUEST
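For readers unfamiliar with these markers: the numbers are multiples of the 4-minute streaming limit, expressed in milliseconds. A minimal sketch of that arithmetic follows, assuming the script keeps a limit constant and a restart counter. The names `STREAMING_LIMIT_MS`, `new_request_marker`, and `restarts_elapsed` are hypothetical and chosen only for this illustration.

```python
# Hypothetical constant; the value matches the 4-minute streaming_limit
# described in the post.
STREAMING_LIMIT_MS = 4 * 60 * 1000  # 240000 ms


def new_request_marker(restart_count: int,
                       limit_ms: int = STREAMING_LIMIT_MS) -> str:
    """Return the log line shown when the stream is restarted for the Nth time."""
    return f"{limit_ms * restart_count}: NEW REQUEST"


def restarts_elapsed(elapsed_ms: int,
                     limit_ms: int = STREAMING_LIMIT_MS) -> int:
    """Number of restarts expected after elapsed_ms of continuous streaming."""
    return elapsed_ms // limit_ms


if __name__ == "__main__":
    for n in (1, 2):
        print(new_request_marker(n))
```

Two markers with no transcription between them therefore mean the stream was reopened twice (at 4 and 8 minutes) without any results arriving in that window.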

Here is the error message I get inside the application if it fails before the new request. This has only happened once.

Traceback (most recent call last):
  File "C:\Users\abc123\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\google\api_core\grpc_helpers.py", line 112, in __next__
    return next(self._wrapped)
  File "C:\Users\abc123\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\grpc\_channel.py", line 543, in __next__
    return self._next()
  File "C:\Users\abc123\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\grpc\_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "Internal error encountered."
        debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.138.95:443 {grpc_message:"Internal error encountered.", grpc_status:13, created_time:"2024-10-15T23:56:40.7689232+00:00"}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\abc123\Desktop\GoogleCloudTTS\GoogleSST.py", line 332, in <module>
    main()
  File "c:\Users\abc123\Desktop\GoogleCloudTTS\GoogleSST.py", line 317, in main
    listen_print_loop(responses, stream)
  File "c:\Users\abc123\Desktop\GoogleCloudTTS\GoogleSTT.py", line 221, in listen_print_loop
    for response in responses:
  File "C:\Users\abc123\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\google\api_core\grpc_helpers.py", line 115, in __next__
    raise exceptions.from_grpc_error(exc) from exc
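One possible workaround for an intermittent INTERNAL error like the one above is to catch it and reopen the stream instead of letting the script exit. The sketch below is generic and uses only the standard library. `stream_with_restart`, `open_stream`, and `retryable` are hypothetical names; in the real script, `open_stream` would wrap the streaming call, and `retryable` could plausibly be `(google.api_core.exceptions.InternalServerError,)`, which is the exception class that `from_grpc_error` should produce for `StatusCode.INTERNAL`. This does not address the root cause, and any audio that was in flight when the error occurred would likely be lost unless the script resends it.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, Type


def stream_with_restart(
    open_stream: Callable[[], Iterable[str]],
    retryable: Tuple[Type[BaseException], ...],
    max_restarts: int = 3,
    backoff_seconds: float = 1.0,
) -> Iterator[str]:
    """Yield items from open_stream(), reopening it after a retryable error.

    open_stream and retryable are placeholders supplied by the caller; nothing
    here is specific to the Speech-to-Text client.
    """
    restarts = 0
    while True:
        try:
            for item in open_stream():
                yield item
            return  # the stream ended normally
        except retryable:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up and surface the original error
            time.sleep(backoff_seconds * restarts)  # simple linear backoff
```

A caller would iterate over `stream_with_restart(...)` in place of iterating over the raw responses, keeping `max_restarts` small so that a persistent failure still surfaces.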

dawnberdan

Hi @fgmustang,

Welcome to Google Cloud Community!

The issue you're facing with your Python script for Google Cloud Speech-to-Text appears to be specific to processing Spanish audio.

Here’s a summary of the problem:

Your script functions perfectly with the English (en-US) language code.
When you switch to Spanish codes (es-MX, es-US, es-VE), it intermittently fails, often around the 4-minute mark (streaming_limit), but can occur anywhere from 15 seconds to 10 minutes.
Short sentences with pauses tend to perform better in Spanish.
You occasionally see the error message "Internal error encountered."

Possible Causes:

Limited Support for Spanish Dialects: Google STT may have restrictions in processing certain Spanish accents or dialects compared to English, leading to increased processing times and potential timeouts.
Network Issues: An unstable internet connection could interrupt the streaming process.
Resource Constraints: While unlikely based on your usage, limited resources on Google’s end might result in occasional errors.

Potential Solutions:

Simplify Audio Input: Aim to reduce the complexity of your Spanish audio. Speak clearly, minimize background noise, and use shorter sentences with breaks if possible.
Adjust Streaming Limit: If failures often occur around the 4-minute mark, try increasing the streaming_limit in your script to allow for more processing time.
Ensure Stable Network Connectivity: Check that you have a reliable internet connection to reduce potential interruptions.
Monitor Google Cloud Status: Regularly check Google Cloud's status dashboard for any reported issues with Speech-to-Text services.

Additional Tips:

Some audio inputs may contain multiple languages, e.g. "Hinglish" (Hindi and English) or "Spanglish" (Spanish and English). While Speech-to-Text does not officially support such inputs, the models are often able to understand them.
Review the Google Cloud documentation for any known limitations related to Spanish language codes in Speech-to-Text.
If the issue persists, I suggest filing a feature request so that our Engineering Team can look into it. Note that there is no definite date for when this would be implemented. You can keep an eye on the release notes for the latest updates or new features related to Speech-to-Text.

I hope the above information is helpful.

fgmustang

Hi @dawnberdan,

I appreciate your reply. I did try all the potential solutions before I posted this issue. The audio input is high quality with no background noise. I have tried multiple speakers who speak clearly and correctly. I have tried 3 different computers, along with 3 different audio setups, just to make sure that wasn't the issue. I would love to adjust the streaming limit. I have tried a 2-minute limit with the same issue, but I can't really go past the 4-minute limit, because Google's online streaming doesn't allow you to stay connected for longer than 5 minutes, so you have to continually reconnect. At least that's according to the documentation I linked in my post. I also tried internet connections at 3 separate locations, all of them high-speed and without latency problems, and they all behaved the same way. I just checked Google Cloud's status dashboard history and there were no issues with Speech-to-Text.

Again thank you for looking at this post and giving me some potential solutions.