
Why is the sample rate optional only for FLAC or WAV files and not for other formats?

For example, at my work we use WEBM_OPUS encoding, which, from what I understand, specifies the sample rate in the audio stream metadata itself. Yet the documentation at https://cloud.google.com/speech-to-text/docs/basics#sample-rates says the field is optional only for FLAC or WAV formats.

And indeed, when I try the Speech-to-Text (GSTT) API with some example code (streaming recognition with WEBM_OPUS audio encoded at a 48000 Hz sample rate), the API accepts sample rates other than 48000 and, depending on the recognition model, produces different results for different sample rate settings!
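
For reference, here is a minimal sketch of how one might check what sample rate the stream itself declares. It assumes the WebM file stores the Opus identification header (the `OpusHead` structure from RFC 7845) in the track's codec private data, so a plain byte search can locate it. The function name `opus_input_sample_rate` is my own, and this is not a real WebM parser. Also note that, as far as I understand, this field records the rate of the original input and is informational; it is not necessarily the rate a decoder outputs.

```python
import struct


def opus_input_sample_rate(data: bytes):
    """Return the input sample rate from the first OpusHead header, or None.

    Hypothetical helper: it scans raw bytes for the "OpusHead" magic
    instead of walking the WebM (EBML) structure properly.
    """
    start = data.find(b"OpusHead")
    # The header needs 8 magic bytes plus 11 bytes of fixed fields.
    if start < 0 or len(data) < start + 19:
        return None
    # Fields after the magic, little-endian (RFC 7845, section 5.1):
    # version (u8), channel count (u8), pre-skip (u16),
    # input sample rate (u32), output gain (i16), mapping family (u8).
    _version, _channels, _pre_skip, sample_rate, _gain, _mapping = (
        struct.unpack_from("<BBHIhB", data, start + 8)
    )
    return sample_rate


# Synthetic header for illustration only: mono, 48000 Hz input rate.
example = b"OpusHead" + struct.pack("<BBHIhB", 1, 1, 312, 48000, 0, 0)
print(opus_input_sample_rate(b"\x00" * 16 + example))  # prints 48000
```

On a real recording, the same function could be called on the bytes read from the file, which would show whether the declared rate matches the value passed in the recognition config.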
