
Error: Speaker Diarization is not working in Speech-to-text API

Hello,

I am using the Speech-to-Text API to transcribe a medical conversation with the enhanced model. I know the conversation is between a doctor and a patient. I am using the following configuration to request the transcription:

from google.cloud import speech

client = speech.SpeechClient()

# gcs_uri is the gs:// path of the audio file (defined elsewhere).
audio = speech.RecognitionAudio(uri=gcs_uri)

diarization_config = speech.SpeakerDiarizationConfig(
    enable_speaker_diarization=True,
    # min_speaker_count=2,
    max_speaker_count=2,
)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
    sample_rate_hertz=32000,
    language_code="en-US",
    use_enhanced=True,
    diarization_config=diarization_config,
    # A model must be specified to use the enhanced model.
    model="medical_conversation",
)

However, the response contains 5 speakers instead of 2 (a sample is shown here):
[Screenshots: AnthonyRuiz1_0-1698457852354.png, AnthonyRuiz1_0-1698460811930.png]

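For anyone trying to reproduce this, here is a minimal sketch of one way the distinct speaker tags in a response could be counted. It assumes the response follows the layout used in Google's diarization sample, where the word list of the last result carries the speaker-tagged words for the whole audio. The helper name `distinct_speaker_tags` and the stand-in response are hypothetical and only for illustration; this is not necessarily how the count above was obtained.

```python
from types import SimpleNamespace


def distinct_speaker_tags(response):
    """Return the sorted speaker tags found in the last result of a response."""
    if not response.results:
        return []
    # Assumption: with diarization enabled, the last result's word list
    # holds the speaker-tagged words for the entire audio.
    words = response.results[-1].alternatives[0].words
    return sorted({word_info.speaker_tag for word_info in words})


# Stand-in response with made-up words, used only to exercise the helper.
# A real response would come from the Speech-to-Text client instead.
fake_words = [
    SimpleNamespace(word="hello", speaker_tag=1),
    SimpleNamespace(word="doctor", speaker_tag=1),
    SimpleNamespace(word="hi", speaker_tag=2),
]
fake_response = SimpleNamespace(
    results=[SimpleNamespace(alternatives=[SimpleNamespace(words=fake_words)])]
)

print(distinct_speaker_tags(fake_response))
```

If a check like this on the real response still reports more than two tags, the extra speakers are coming from the API output itself rather than from how the results are read.
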
Do you know how to solve this?

Thank you in advance.
