Hi Everyone,
I'm tearing my hair out trying to get my first Java speech-to-text transcription working. The code is below. The WAV file uploads to the storage bucket fine, and if I run the transcription manually through the web front end it works, but my code fails with the error INVALID_ARGUMENT: Invalid resource field value in the request at the line speech.batchRecognizeOperationCallable().call(request);. Unfortunately I don't get any more information than that, so I'm debugging blind.
Any help would be greatly appreciated - I've hit a wall on this one.
private static String getTranscript(byte[] audio) throws IOException, ExecutionException, InterruptedException {
    InputStream credentialsStream = TestCompressor.class.getResourceAsStream("/keys/google.json");
    GoogleCredentials credentials = GoogleCredentials.fromStream(credentialsStream);

    Storage storage = StorageOptions.newBuilder().setCredentials(credentials).build().getService();
    BlobId blobId = BlobId.of("isaidusaid", "testfile.wav");
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("audio/wav").build();
    Blob blob = storage.create(blobInfo, audio);

    FixedCredentialsProvider credentialsProvider = FixedCredentialsProvider.create(credentials);
    SpeechSettings speechSettings =
        SpeechSettings.newBuilder()
            .setCredentialsProvider(credentialsProvider)
            .build();
    String gcsUri = "gs://isaidusaid/testfile.wav";
    SpeechClient speech = SpeechClient.create(speechSettings);
    String parent = "projects/isaidusaid/locations/global";

    RecognitionConfig recognitionConfig = RecognitionConfig.newBuilder()
        .setExplicitDecodingConfig(
            ExplicitDecodingConfig.newBuilder()
                .setEncoding(ExplicitDecodingConfig.AudioEncoding.LINEAR16)
                .setSampleRateHertz(16000)
                .build())
        .addLanguageCodes("en-US")
        .setModel("long")
        .build();
    BatchRecognizeFileMetadata metadata = BatchRecognizeFileMetadata.newBuilder().setUri(gcsUri).build();
    RecognitionOutputConfig outputConfig = RecognitionOutputConfig.newBuilder()
        .setInlineResponseConfig(InlineOutputConfig.newBuilder().build())
        .build();
    BatchRecognizeRequest request = BatchRecognizeRequest.newBuilder()
        .setConfig(recognitionConfig)
        .addFiles(metadata)
        .setRecognitionOutputConfig(outputConfig)
        .build();

    BatchRecognizeResponse response = speech.batchRecognizeOperationCallable().call(request);

    StringBuilder builder = new StringBuilder();
    for (SpeechRecognitionResult result : response.getResultsMap().get(gcsUri).getInlineResult().getTranscript().getResultsList()) {
        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        if (result.getAlternativesCount() > 0) {
            SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
            builder.append(alternative.getTranscript());
        }
    }

    storage.delete(blobId);
    return builder.toString();
}
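One thing that may be worth checking, assuming this is the Speech-to-Text v2 client: the method builds a parent string but never passes it to the request, so the request's recognizer field is left empty, and an empty resource name is a plausible source of "Invalid resource field value". The v2 API expects a recognizer name of the form projects/{project}/locations/{location}/recognizers/{recognizer}, where "_" selects the default recognizer. The sketch below only builds that string; RecognizerNames and its methods are hypothetical helpers, not part of the Google library, and the project ID is taken from the parent string in the code above.

```java
public class RecognizerNames {

    // Hypothetical helper (not a Google API): formats a Speech-to-Text v2
    // recognizer resource name from its three parts.
    public static String recognizerName(String project, String location, String recognizerId) {
        return String.format("projects/%s/locations/%s/recognizers/%s",
                project, location, recognizerId);
    }

    // "_" is the ID the v2 API documents for the implicit default recognizer.
    public static String defaultRecognizer(String project, String location) {
        return recognizerName(project, location, "_");
    }

    public static void main(String[] args) {
        // Assumption: project ID and location match the unused 'parent' string.
        String recognizer = defaultRecognizer("isaidusaid", "global");
        System.out.println(recognizer);
        // prints: projects/isaidusaid/locations/global/recognizers/_
    }
}
```

If that is the cause, the resulting string would be passed to the request builder with .setRecognizer(recognizer) alongside the existing setConfig, addFiles, and setRecognitionOutputConfig calls. I have not run this against the live service, so treat it as a lead to try first, not a confirmed fix.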