In my code (below), when I run the audio through STT, it only gives me the first word of the whole audio.
The audio contains "A B C D E F".
What am I missing?
    Imports Microsoft.CognitiveServices.Speech
    Imports Microsoft.CognitiveServices.Speech.SpeechConfig
    Imports Microsoft.CognitiveServices.Speech.Audio
    Module Module1
        Sub Main()
            Dim SpeechConfig As SpeechConfig = FromSubscription("<CHANGED>", "eastus")
            Dim audioConfig As Audio.AudioConfig = Audio.AudioConfig.FromWavFileInput("<CHANGED>.wav")
            SpeechConfig.OutputFormat = Microsoft.CognitiveServices.Speech.OutputFormat.Detailed
            Dim recognizer As New SpeechRecognizer(SpeechConfig, audioConfig)
            Dim result = recognizer.RecognizeOnceAsync().Result
            Select Case result.Reason
                Case ResultReason.RecognizedSpeech
                    Console.WriteLine($"RECOGNIZED: Text={result.Text}")
                    Console.WriteLine($" Intent not recognized.")
                Case ResultReason.NoMatch
                    Console.WriteLine($"NOMATCH: Speech could not be recognized.")
                Case ResultReason.Canceled
                    Dim cancellation = CancellationDetails.FromResult(result)
                    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
                    If cancellation.Reason = CancellationReason.[Error] Then
                        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                        Console.WriteLine($"CANCELED: Did you update the subscription info?")
                    End If
            End Select
        End Sub
    End Module
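You can download the audio file from GitHub here: https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav

Also, if you know where I can get more detailed STT data, I would appreciate it. What I am looking for is something like a JSON output that gives the start time and end time for words and/or sentences.
Your help is much appreciated.

Update: For some reason the async event handlers did not work for me; however, the code below did.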
    While True
        Dim result = recognizer.RecognizeOnceAsync().Result
        Select Case result.Reason
            Case ResultReason.RecognizedSpeech
                Console.WriteLine($"RECOGNIZED: Text={result.Text}")
                Console.WriteLine($" Intent not recognized.")
            Case ResultReason.NoMatch
                Console.WriteLine($"NOMATCH: Speech could not be recognized.")
            Case ResultReason.Canceled
                Dim cancellation = CancellationDetails.FromResult(result)
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
                If cancellation.Reason = CancellationReason.[Error] Then
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                    Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                    Console.WriteLine($"CANCELED: Did you update the subscription info?")
                End If
                Exit While
        End Select
    End While
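The RecognizeOnceAsync method recognizes only "once": it returns the first "utterance" contained in the audio file. If you want to recognize more than one phrase, you can do one of two things:

1. Call RecognizeOnceAsync repeatedly. After the last phrase has been recognized, the next call returns a result whose result.Reason is set to Canceled.
2. Switch from RecognizeOnceAsync to StartContinuousRecognitionAsync and attach an event handler to the Recognizing event (or to Recognized, which fires with the final result for each phrase). The event callback lets you inspect the result through the SpeechRecognitionEventArgs passed to it, like this: e.Result...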
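
The second option could look roughly like the following in VB.NET. This is a minimal sketch, not a tested solution: "YOUR-KEY", "YOUR-REGION", and "abcdef.wav" are placeholders, the module name is made up for illustration, and it subscribes to Recognized rather than Recognizing so that each phrase is printed once with its final text.

```vb
Imports System
Imports System.Threading.Tasks
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.Audio

Module ContinuousSketch
    Sub Main()
        ' Placeholders (assumptions): substitute your own key, region, and WAV path.
        Dim config = SpeechConfig.FromSubscription("YOUR-KEY", "YOUR-REGION")
        Using audioInput = AudioConfig.FromWavFileInput("abcdef.wav")
            Using recognizer As New SpeechRecognizer(config, audioInput)
                ' Completed when the session stops or recognition is canceled
                ' (for file input, cancellation is also how end-of-file is reported).
                Dim done As New TaskCompletionSource(Of Integer)()

                ' Fires once per recognized phrase with the final result.
                AddHandler recognizer.Recognized,
                    Sub(s As Object, e As SpeechRecognitionEventArgs)
                        If e.Result.Reason = ResultReason.RecognizedSpeech Then
                            Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}")
                        End If
                    End Sub

                AddHandler recognizer.Canceled,
                    Sub(s As Object, e As SpeechRecognitionCanceledEventArgs)
                        Console.WriteLine($"CANCELED: Reason={e.Reason}")
                        done.TrySetResult(0)
                    End Sub

                AddHandler recognizer.SessionStopped,
                    Sub(s As Object, e As SessionEventArgs)
                        done.TrySetResult(0)
                    End Sub

                ' Block the console app until the whole file has been processed.
                recognizer.StartContinuousRecognitionAsync().Wait()
                done.Task.Wait()
                recognizer.StopContinuousRecognitionAsync().Wait()
            End Using
        End Using
    End Sub
End Module
```

The handlers are attached before StartContinuousRecognitionAsync is called so that no early event is missed, and the TaskCompletionSource keeps Main from returning before the callbacks have run.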
You can see both behaviors by running the Speech CLI, like this:

    spx recognize --once+ --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav"
    spx recognize --continuous --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav"

You can download the Speech CLI here: https://aka.ms/speech/spx-zips.zip
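
On the request for more detailed output with timings: one possible approach, sketched here under assumptions rather than taken from the code above, is to request word-level timestamps on the SpeechConfig and read the raw JSON response from the result's property collection. The key, region, file name, and module name are placeholders.

```vb
Imports System
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.Audio

Module DetailedJsonSketch
    Sub Main()
        ' Placeholders (assumptions): substitute your own key, region, and WAV path.
        Dim config = SpeechConfig.FromSubscription("YOUR-KEY", "YOUR-REGION")
        config.OutputFormat = OutputFormat.Detailed
        ' Ask the service to include per-word timing in the detailed response.
        config.RequestWordLevelTimestamps()

        Using audioInput = AudioConfig.FromWavFileInput("abcdef.wav")
            Using recognizer As New SpeechRecognizer(config, audioInput)
                ' Single-shot call, so this prints JSON for the first utterance only.
                Dim result = recognizer.RecognizeOnceAsync().Result
                If result.Reason = ResultReason.RecognizedSpeech Then
                    ' Raw service response as a JSON string.
                    Dim json = result.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult)
                    Console.WriteLine(json)
                End If
            End Using
        End Using
    End Sub
End Module
```

With the detailed format, the JSON should contain an NBest array whose entries carry a Words list with Offset and Duration values, expressed in 100-nanosecond units. The same property lookup can be applied to e.Result inside a Recognized handler to get timings for every phrase in the file.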