The private preview of Azure Speech diarization used to assign an "Unknown" speaker label until it had accumulated up to about 7 seconds of a speaker's audio, whereas the public-preview API starts labeling speakers Guest-n immediately. This causes accuracy problems: even when Guest-1 has already been detected, a short utterance from them can be labeled Guest-2 until Guest-2 speaks a longer sentence, and vice versa.
Is there a workaround to restore the private-preview behavior?
According to the documentation, shorter phrases should still be labeled as Unknown.
SDK version used (Gradle dependency):
implementation group: 'com.microsoft.cognitiveservices.speech', name: 'client-sdk', version: '1.34.0'
Diarization is described as the process of separating audio that contains multiple speakers into discrete speech segments, according to the identity of the speaker in each segment.
Note: real-time diarization is currently in public preview.
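I am not aware of a documented switch in SDK 1.34.0 that restores the private-preview labeling. One possible client-side workaround is to withhold the Guest-n label for short utterances and report them as "Unknown" yourself. The sketch below is my own (the class, method, and the 7-second threshold are hypothetical, not part of the Speech SDK):

```java
// Client-side relabeling sketch (NOT an SDK feature): utterances shorter
// than a chosen duration keep the "Unknown" label, mimicking the
// private-preview behavior described above.
public class SpeakerLabelFilter {
    // Hypothetical threshold: roughly the 7 s the private preview reportedly
    // waited for before committing to a speaker label.
    private static final long MIN_CONFIDENT_DURATION_MS = 7000;

    /** Returns "Unknown" for short or unlabeled utterances, otherwise the SDK's speaker ID. */
    public static String effectiveSpeakerId(String sdkSpeakerId, long utteranceDurationMs) {
        if (utteranceDurationMs < MIN_CONFIDENT_DURATION_MS
                || sdkSpeakerId == null || sdkSpeakerId.isEmpty()) {
            return "Unknown";
        }
        return sdkSpeakerId;
    }

    public static void main(String[] args) {
        // A short clip keeps the neutral label; a long one keeps the SDK label.
        System.out.println(effectiveSpeakerId("Guest-2", 1500)); // prints Unknown
        System.out.println(effectiveSpeakerId("Guest-1", 9000)); // prints Guest-1
    }
}
```

In the `transcribed` handler you would derive the utterance duration from the recognition result (the SDK exposes a duration on results, in 100-ns ticks) and pass it through this filter before displaying the speaker ID.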
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Semaphore;

import com.microsoft.cognitiveservices.speech.CancellationReason;
import com.microsoft.cognitiveservices.speech.ResultReason;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
import com.microsoft.cognitiveservices.speech.transcription.ConversationTranscriber;

public class ConversationTranscription {

    // Read the key and region from the environment (replace with your own values).
    private static String speechKey = System.getenv("SPEECH_KEY");
    private static String speechRegion = System.getenv("SPEECH_REGION");

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        SpeechConfig speechConfig = SpeechConfig.fromSubscription(speechKey, speechRegion);
        speechConfig.setSpeechRecognitionLanguage("en-US");
        AudioConfig audioInput = AudioConfig.fromWavFileInput("katiesteve.wav");

        Semaphore stopRecognitionSemaphore = new Semaphore(0);

        ConversationTranscriber conversationTranscriber = new ConversationTranscriber(speechConfig, audioInput);

        // Subscribe to events.
        conversationTranscriber.transcribing.addEventListener((s, e) -> {
            System.out.println("TRANSCRIBING: Text=" + e.getResult().getText());
        });

        conversationTranscriber.transcribed.addEventListener((s, e) -> {
            if (e.getResult().getReason() == ResultReason.RecognizedSpeech) {
                System.out.println("TRANSCRIBED: Text=" + e.getResult().getText()
                        + " Speaker ID=" + e.getResult().getSpeakerId());
            }
            else if (e.getResult().getReason() == ResultReason.NoMatch) {
                System.out.println("NOMATCH: Speech could not be transcribed.");
            }
        });

        conversationTranscriber.canceled.addEventListener((s, e) -> {
            System.out.println("CANCELED: Reason=" + e.getReason());
            if (e.getReason() == CancellationReason.Error) {
                System.out.println("CANCELED: ErrorCode=" + e.getErrorCode());
                System.out.println("CANCELED: ErrorDetails=" + e.getErrorDetails());
                System.out.println("CANCELED: Did you update the subscription info?");
            }
            stopRecognitionSemaphore.release();
        });

        conversationTranscriber.sessionStarted.addEventListener((s, e) -> {
            System.out.println("\nSession started event.");
        });

        conversationTranscriber.sessionStopped.addEventListener((s, e) -> {
            System.out.println("\nSession stopped event.");
        });

        conversationTranscriber.startTranscribingAsync().get();

        // Wait for completion.
        stopRecognitionSemaphore.acquire();

        conversationTranscriber.stopTranscribingAsync().get();

        speechConfig.close();
        audioInput.close();
        conversationTranscriber.close();

        System.exit(0);
    }
}
Output: