Even after I set a specific language in a supported region, Chirp 2 keeps trying to auto-detect the language. Sometimes it gets it right,
and sometimes it treats the source audio as another language. Is the language setting just a preference that doesn't stop Google from
trying to auto-detect it?
from google.cloud.speech_v2.types import cloud_speech

config = cloud_speech.RecognitionConfig(
    auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
    language_codes=["ta-IN"],  # Fixed to Tamil (India) rather than "auto".
    model="chirp_2",
)
# noinspection PyTypeChecker
request = cloud_speech.RecognizeRequest(
    recognizer="projects/not-my-project-id/locations/us-central1/recognizers/_",
    config=config,
    content=audio_file.read(),
)
Hi @James23454454,
Welcome to Google Cloud Community!
Setting both auto_decoding_config and language_codes in your Google Cloud Speech-to-Text configuration makes language_codes act only as hints. Chirp 2 will still try to auto-detect the language, leading to inconsistent results.
To force a specific language "ta-IN" and disable auto-detection entirely, remove auto_decoding_config from your RecognitionConfig:
config = cloud_speech.RecognitionConfig(
    language_codes=["ta-IN"],
    model="chirp_2",
)
This ensures the API only uses the language code you provided.
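If you want to check which language each result was actually recognized in, one option is to compare the language reported on each result with the one you asked for. The following is only a minimal sketch: it assumes the response exposes results[i].language_code and results[i].alternatives[j].transcript, as the v2 response does as far as I know. The helper name split_by_language and the stand-in response object are hypothetical and used purely for illustration, so no API call is made here.

```python
from types import SimpleNamespace


def split_by_language(response, expected="ta-IN"):
    """Split transcripts into (matching, mismatched) by reported language.

    Hypothetical helper: assumes each result has `language_code` and
    `alternatives[0].transcript` attributes.
    """
    matching, mismatched = [], []
    for result in response.results:
        if not result.alternatives:
            continue  # skip results that carry no transcript
        transcript = result.alternatives[0].transcript
        # BCP-47 tags are case-insensitive, so compare in lower case.
        reported = (result.language_code or "").lower()
        if reported == expected.lower():
            matching.append(transcript)
        else:
            mismatched.append((result.language_code, transcript))
    return matching, mismatched


# Stand-in for a real response object (assumption, for illustration only).
fake_response = SimpleNamespace(
    results=[
        SimpleNamespace(
            language_code="ta-IN",
            alternatives=[SimpleNamespace(transcript="vanakkam")],
        ),
        SimpleNamespace(
            language_code="en-US",
            alternatives=[SimpleNamespace(transcript="hello")],
        ),
        SimpleNamespace(language_code="ta-IN", alternatives=[]),
    ]
)

matching, mismatched = split_by_language(fake_response, expected="ta-IN")
print("matching:", matching)
print("mismatched:", mismatched)
```

With a real response you would pass the object returned by the recognize call instead of fake_response. Any entries in mismatched show that a result came back tagged with a different language than the one you requested.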
Here’s a helpful link on your Recognition Config.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.