chirp2 trying to auto-detect language even when one language is given as the language code

James23454454 · 03-05-2025 09:55 AM

Even after I specify a language in a supported region, chirp2 keeps trying to auto-detect the language. Sometimes it succeeds,
and sometimes it treats the source as another language. Is the language setting just a preference that doesn't stop Google from
trying to auto-detect it?

config = cloud_speech.RecognitionConfig(
    auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
    language_codes=["ta-IN"],  # A single explicit language code (Tamil, India), not "auto".
    model="chirp_2",
)

# noinspection PyTypeChecker
request = cloud_speech.RecognizeRequest(
    recognizer='projects/not-my-project-id/locations/us-central1/recognizers/_',
    config=config,
    content=audio_file.read(),
)

NorieRam

Hi @James23454454,

Welcome to Google Cloud Community!

Setting both auto_decoding_config and language_codes in your Google Cloud Speech-to-Text configuration makes language_codes act only as hints. Chirp 2 will still try to auto-detect the language, leading to inconsistent results.

To force a specific language "ta-IN" and disable auto-detection entirely, remove auto_decoding_config from your RecognitionConfig:

config = cloud_speech.RecognitionConfig(
    language_codes=["ta-IN"],
    model="chirp_2",
)

This ensures the API only uses the language code you provided.
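If results still vary, it may help to check which language each result was attributed to. Below is a minimal sketch, assuming the Speech-to-Text v2 response exposes a language_code on each result and a transcript on each alternative. The helper name split_by_language and the stand-in response are hypothetical and only there for illustration; a real response would come from client.recognize(request=request). The comparison is case-insensitive as a precaution, in case the service reports the code in a different case than the one requested.

```python
from types import SimpleNamespace


def split_by_language(response, expected_code):
    """Split results into (matching, mismatched) by their reported language_code."""
    expected = expected_code.lower()
    matching, mismatched = [], []
    for result in response.results:
        # Skip results that carry no alternatives (for example, empty segments).
        if not result.alternatives:
            continue
        reported = (result.language_code or "").lower()
        entry = (result.language_code, result.alternatives[0].transcript)
        (matching if reported == expected else mismatched).append(entry)
    return matching, mismatched


# Stand-in response for illustration only (hypothetical data, not real API output).
fake_response = SimpleNamespace(results=[
    SimpleNamespace(language_code="ta-in",
                    alternatives=[SimpleNamespace(transcript="vanakkam")]),
    SimpleNamespace(language_code="hi-in",
                    alternatives=[SimpleNamespace(transcript="namaste")]),
])

matching, mismatched = split_by_language(fake_response, "ta-IN")
print("matching:", matching)
print("mismatched:", mismatched)
```

Logging the mismatched entries alongside the audio file name makes it easier to tell whether the wrong language shows up for specific recordings or across the board.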

Here’s a helpful link on your Recognition Config.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
