Google Cloud Speech-to-Text V2 (Android): `SPEECH_ACTIVITY_END` not triggered with `ja-JP` but works with `en-US`

Environment:

  • Platform: Android (Java)
  • Google Cloud Speech-to-Text Version: V2
  • Model: chirp_2
  • Endpoint: us-central1-speech.googleapis.com:443

Description:

I'm using Google Cloud Speech-to-Text V2 in an Android app.
The `SPEECH_ACTIVITY_END` event is not triggered when the language code is `ja-JP`, but it works as expected with `en-US`.

Here is the relevant code snippet for the streaming recognition features:

protected StreamingRecognitionFeatures clientFeaturesOf(GoogleV2AISpeechConfig config) {
    return config.getGoogleV2().getFeatures().apply(
        StreamingRecognitionFeatures.newBuilder()
            .setInterimResults(true)
            .setEnableVoiceActivityEvents(true)
            .setVoiceActivityTimeout(
                StreamingRecognitionFeatures.VoiceActivityTimeout.newBuilder()
                    .setSpeechEndTimeout(Duration.newBuilder().setSeconds(3).build())
                    .build()
            )
            .build()
    );
}

With languageCode = "en-US", the response stream includes the expected events:

SpeechEventType: SPEECH_ACTIVITY_BEGIN  
SpeechEventType: SPEECH_EVENT_TYPE_UNSPECIFIED  
SpeechEventType: SPEECH_EVENT_TYPE_UNSPECIFIED  
...
SpeechEventType: SPEECH_ACTIVITY_END

With languageCode = "ja-JP", everything else remains the same (code, endpoint, model, audio config, etc.), except that the audio content is in Japanese. In this case, however, I never receive the SPEECH_ACTIVITY_END event. The sequence looks like this:

SpeechEventType: SPEECH_ACTIVITY_BEGIN  
SpeechEventType: SPEECH_EVENT_TYPE_UNSPECIFIED  
SpeechEventType: SPEECH_EVENT_TYPE_UNSPECIFIED
...  
// No SPEECH_ACTIVITY_END
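
To make the difference between the two sequences concrete, here is a small check that reports whether a SPEECH_ACTIVITY_BEGIN was ever left without a matching SPEECH_ACTIVITY_END. It is only a sketch: the class name `VoiceActivityCheck` and the method `hasUnclosedActivity` are hypothetical, and the events are plain strings copied from the logs above rather than the client library's own event type.

```java
import java.util.List;

public class VoiceActivityCheck {
    // Event names as they appear in the logs above (plain strings, an assumption
    // made so the sketch runs without the Speech-to-Text client library).
    static final String BEGIN = "SPEECH_ACTIVITY_BEGIN";
    static final String END = "SPEECH_ACTIVITY_END";
    static final String UNSPECIFIED = "SPEECH_EVENT_TYPE_UNSPECIFIED";

    /** Returns true if the last SPEECH_ACTIVITY_BEGIN has no SPEECH_ACTIVITY_END after it. */
    static boolean hasUnclosedActivity(List<String> events) {
        boolean open = false;
        for (String event : events) {
            if (BEGIN.equals(event)) {
                open = true;   // speech started
            } else if (END.equals(event)) {
                open = false;  // speech ended, activity is closed
            }
            // Any other event type leaves the state unchanged.
        }
        return open;
    }

    public static void main(String[] args) {
        // Sequences shortened from the logs above.
        List<String> enUs = List.of(BEGIN, UNSPECIFIED, UNSPECIFIED, END);
        List<String> jaJp = List.of(BEGIN, UNSPECIFIED, UNSPECIFIED);

        System.out.println("en-US unclosed: " + hasUnclosedActivity(enUs));
        System.out.println("ja-JP unclosed: " + hasUnclosedActivity(jaJp));
    }
}
```

With the shortened sequences above, the en-US stream reports no unclosed activity, while the ja-JP stream does.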

Questions:

Is this an expected limitation or behavior for certain languages (like Japanese)?

Is SPEECH_ACTIVITY_END supported only for some language models or locales?

Is there anything I should change in the configuration when using ja-JP?

Any suggestions or insights are greatly appreciated.
