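Hello, I have tried Azure Speech to Text on an audio file and everything works fine. Can anyone guide me on how to convert speech to text in Azure starting from an audio URL? I am using the REST API.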
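Seconding what Nicolas said: if you want to transcribe audio from a URL with the REST API, batch transcription is the best option.
There is also a JS sample for batch transcription: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch/js/node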
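For reference, here is a rough C# sketch of what a batch transcription request against an audio URL could look like. It assumes the v3.1 Speech to text REST API (the `speechtotext/v3.1/transcriptions` path, the `contentUrls` field and the `links.files` property); please check those against the API version you actually target. The class name `BatchTranscriptionSketch`, the display name and the 10 second polling interval are made up for illustration.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class BatchTranscriptionSketch
{
    // Builds the JSON body for a "create transcription" request.
    // Field names follow the v3.1 batch transcription API (assumption: adjust for other versions).
    public static string BuildRequestBody(string audioUrl, string locale)
    {
        var payload = new
        {
            contentUrls = new[] { audioUrl },
            locale,
            displayName = "Transcription from audio URL", // hypothetical display name
            properties = new { wordLevelTimestampsEnabled = true }
        };
        return JsonSerializer.Serialize(payload);
    }

    // Submits the job, polls until it finishes, and returns the URL that lists the result files.
    public static async Task<string> TranscribeFromUrlAsync(
        string region, string subscriptionKey, string audioUrl, string locale)
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Assumed v3.1 endpoint; the audio URL must be reachable by the service (for example a SAS URL).
        var endpoint = $"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions";
        using var content = new StringContent(BuildRequestBody(audioUrl, locale), Encoding.UTF8, "application/json");
        using var created = await http.PostAsync(endpoint, content);
        created.EnsureSuccessStatusCode();

        // The response describes the new job; "self" is the URL used for polling.
        using var createdJson = JsonDocument.Parse(await created.Content.ReadAsStringAsync());
        var selfUrl = createdJson.RootElement.GetProperty("self").GetString()!;

        while (true)
        {
            await Task.Delay(TimeSpan.FromSeconds(10)); // arbitrary polling interval
            using var statusJson = JsonDocument.Parse(await http.GetStringAsync(selfUrl));
            var status = statusJson.RootElement.GetProperty("status").GetString();

            if (status == "Succeeded")
                return statusJson.RootElement.GetProperty("links").GetProperty("files").GetString()!;
            if (status == "Failed")
                throw new InvalidOperationException("Batch transcription failed.");
        }
    }
}
```

You would call it with something like `await BatchTranscriptionSketch.TranscribeFromUrlAsync(region, key, audioUrl, "en-US")`, then GET the returned files URL and download the entry whose kind is the transcription result. Unlike the SDK recognizer, this is asynchronous on the service side, so expect to wait for the job to complete rather than getting text back immediately.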
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public async Task<List<AudioTranscriptDto>> GenerateAudioTranscript(string audioFilePath)
{
    try
    {
        var speechConfig = SpeechConfig.FromSubscription(_configuration["SpeechCognitive:SubscriptionKey"], _configuration["SpeechCognitive:Region"]);
        speechConfig.RequestWordLevelTimestamps();
        speechConfig.OutputFormat = Microsoft.CognitiveServices.Speech.OutputFormat.Detailed;
        using var audioConfig = AudioConfig.FromWavFileInput(audioFilePath);
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
        // Completed when the recognition session ends (end of file or cancellation).
        var speechRecognizerWaiter = new TaskCompletionSource<string>();
        var transcript = new List<AudioTranscriptDto>();
        int sequenceNumber = 1;
        // Fired once per recognized phrase; collect text plus start/end time.
        recognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                string text = e.Result.Text;
                var duration = e.Result.Duration;
                long offset = e.Result.OffsetInTicks;
                var startTime = TimeSpan.FromTicks(offset);
                var endTime = startTime + duration;
                // SRT-formatted block for this phrase (currently unused; the DTO below carries the same fields).
                var srtBlock = $"{sequenceNumber}\n{FormatTimeSpan(startTime)} --> {FormatTimeSpan(endTime)}\n{text}\n\n";
                transcript.Add(new AudioTranscriptDto
                {
                    SequenceNumber = sequenceNumber,
                    StartTime = FormatTimeSpan(startTime),
                    EndTime = FormatTimeSpan(endTime),
                    Text = text
                });
                sequenceNumber++;
            }
        };
        recognizer.SessionStarted += (sender, e) =>
        {
            Console.WriteLine("-----------> started");
        };
        recognizer.SessionStopped += (sender, e) =>
        {
            Console.WriteLine("-----------> stopped");
            speechRecognizerWaiter.TrySetResult("Recognition completed");
        };
        await recognizer.StartContinuousRecognitionAsync();
        // Wait until the whole file has been processed.
        var str = await speechRecognizerWaiter.Task;
        await recognizer.StopContinuousRecognitionAsync();
        return transcript;
    }
    catch (Exception)
    {
        // Rethrow unchanged; add logging here if needed.
        throw;
    }
}

// Formats a TimeSpan as an SRT timestamp (HH:mm:ss,fff).
static string FormatTimeSpan(TimeSpan timeSpan) { return $"{timeSpan.Hours:D2}:{timeSpan.Minutes:D2}:{timeSpan.Seconds:D2},{timeSpan.Milliseconds:D3}"; }