gRPC bidirectional stream application semantics

I am looking for guidance, based on experience or expertise, on the following scenario:

We have a client that communicates with a server over a gRPC bidirectional stream. It first sends the audio configuration to use for transcription and then the audio data, much like Google Cloud Speech-to-Text.

The question is: should the client wait for an acknowledgement from the server that the audio configuration is valid before it proceeds to send audio data, or should we not enforce waiting?
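
To make the two options concrete, here is a rough sketch of what I mean. It does not use gRPC at all: two asyncio queues stand in for the two directions of the stream, and the names (toy_server, run_client, the sample_rate_hz field, the ack/error messages) are made up for illustration, not taken from our real service or from any library.

```python
import asyncio


async def toy_server(requests: asyncio.Queue, responses: asyncio.Queue) -> None:
    """Hypothetical stand-in for the server side of the bidirectional stream."""
    config = await requests.get()  # the first message is expected to be the config
    if config.get("sample_rate_hz", 0) <= 0:
        # Invalid config: report the error and stop reading the stream.
        await responses.put({"error": "invalid config"})
        return
    await responses.put({"ack": True})
    # None is used here as an end-of-stream marker (an assumption of this toy).
    while (chunk := await requests.get()) is not None:
        await responses.put({"transcript": f"{len(chunk)} bytes received"})


async def run_client(config: dict, chunks: list, wait_for_ack: bool):
    """Send config then audio; return (chunks_sent, first_server_response)."""
    requests, responses = asyncio.Queue(), asyncio.Queue()
    server_task = asyncio.create_task(toy_server(requests, responses))

    await requests.put(config)
    first = None
    if wait_for_ack:
        # Option A: block until the server accepts or rejects the config.
        first = await responses.get()
        if "error" in first:
            await server_task
            return 0, first

    # Option B falls straight through to here and sends audio immediately.
    sent = 0
    for chunk in chunks:
        await requests.put(chunk)
        sent += 1
    await requests.put(None)

    if first is None:
        first = await responses.get()
    await server_task
    return sent, first


if __name__ == "__main__":
    bad_config = {"sample_rate_hz": 0}
    audio = [b"\x00" * 320, b"\x00" * 320]
    print(asyncio.run(run_client(bad_config, audio, wait_for_ack=True)))
    print(asyncio.run(run_client(bad_config, audio, wait_for_ack=False)))
```

In this toy, waiting costs one extra round trip before any audio is sent, while not waiting means every chunk is sent before the client sees that the config was rejected.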

Thanks in advance.
