
Google Cloud Speech to Text V2 streaming audio feed from a microphone

spy
Bronze 1

I'm running streaming speech-to-text on an Android device with microphone input, and it works smoothly with the V1 API.

Here is the V1 tutorial I followed:
https://cloud.google.com/speech-to-text/docs/transcribe-streaming-audio

I used StreamingRecognizeRequest and set a ResponseObserver as the callback, and the final transcripts were returned as expected.

However, when I tried to migrate the code to V2, it stopped working properly.
Here is my code (Java):

ResponseObserver<StreamingRecognizeResponse> responseObserver = new ResponseObserver<>() {
    @Override
    public void onStart(StreamController controller) {
        Log.d(TAG, "onStart = " + controller);
    }

    @Override
    public void onResponse(StreamingRecognizeResponse response) {
        Log.d(TAG, "onResponse = ");
    }

    @Override
    public void onComplete() {
        Log.d(TAG, "onComplete = ");
    }

    @Override
    public void onError(Throwable t) {
        Log.d(TAG, "onError = " + t);
    }
};

RecognitionConfig recognitionConfig = RecognitionConfig.newBuilder()
        .addLanguageCodes("en-US")
        .setAutoDecodingConfig(AutoDetectDecodingConfig.newBuilder().build())
        .build();
StreamingRecognitionConfig streamingRecognitionConfig = StreamingRecognitionConfig.newBuilder()
        .setConfig(recognitionConfig)
        .build();
StreamingRecognizeRequest streamingRecognizeRequest = StreamingRecognizeRequest.newBuilder()
        .setStreamingConfig(streamingRecognitionConfig)
        .setRecognizer(recognizer.getName())
        .build();
mClientStream = mSpeechClient.streamingRecognizeCallable().splitCall(responseObserver);
mClientStream.send(streamingRecognizeRequest);

// receive audio buffer continuously
if (mAudioEmitter != null) {
    mAudioEmitter.start((ByteString bytes) -> {
        StreamingRecognizeRequest.Builder sBuilder = StreamingRecognizeRequest.newBuilder()
                .setRecognizerBytes(recognizer.getNameBytes())
                .setAudio(bytes);
        mClientStream.send(sBuilder.build());
    });
}
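
For reference, here is a minimal sketch of how each microphone buffer could be bounded in size before it is wrapped in a request and sent. It assumes the emitter delivers 16-bit PCM, and both the class name `AudioChunker` and the idea of a maximum chunk size are hypothetical: I have not confirmed what per-message limit, if any, V2 enforces, so the value passed in would have to come from the service documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: sizes and splits raw 16-bit PCM buffers. */
final class AudioChunker {
    private AudioChunker() {}

    /** Bytes produced by `millis` of 16-bit PCM (2 bytes per sample per channel). */
    static int bytesForMillis(int sampleRateHz, int channels, int millis) {
        return (int) ((long) sampleRateHz * channels * 2 * millis / 1000);
    }

    /** Splits a buffer into consecutive pieces of at most maxChunkBytes each. */
    static List<byte[]> split(byte[] buffer, int maxChunkBytes) {
        if (maxChunkBytes <= 0) {
            throw new IllegalArgumentException("maxChunkBytes must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < buffer.length; start += maxChunkBytes) {
            int end = Math.min(buffer.length, start + maxChunkBytes);
            chunks.add(Arrays.copyOfRange(buffer, start, end));
        }
        return chunks;
    }
}
```

Inside the emitter callback, each piece returned by `split(...)` would be sent as its own audio request instead of sending the whole buffer at once.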

I realized that one of the differences between V1 and V2 is the Recognizer object, so I set that parameter and made sure it was correct.
But it still does not work: onStart() is called, but onResponse() never is.
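
One thing that may be worth ruling out is the audio format. The config above uses AutoDetectDecodingConfig, which, as far as I understand, identifies the format from a container header, while a microphone on Android usually delivers headerless PCM (that is an assumption about my emitter, not something I have verified). V2 also has an ExplicitDecodingConfig where encoding, sample rate, and channel count are stated by hand. The sketch below is only a diagnostic with a hypothetical class name: it checks whether the first bytes of the stream look like a RIFF/WAVE header.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical diagnostic: does a buffer start with a RIFF/WAVE container header? */
final class WavHeaderCheck {
    private WavHeaderCheck() {}

    static boolean looksLikeRiffWav(byte[] head) {
        // A WAV file starts with "RIFF", a 4-byte size field, then "WAVE".
        if (head == null || head.length < 12) {
            return false;
        }
        String riff = new String(head, 0, 4, StandardCharsets.US_ASCII);
        String wave = new String(head, 8, 4, StandardCharsets.US_ASCII);
        return riff.equals("RIFF") && wave.equals("WAVE");
    }
}
```

If this returns false for the first buffer from the microphone, the stream is probably raw samples with no header for auto-detection to work from.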

Also, the V2 developer guides do not include any sample for live audio input (microphone); they all cover audio file recognition.

Is there any restriction on V2?

Thanks
