Timeout after spoken keyword with Cognitive Services


I want to create a voice-activated personal assistant similar to Siri or Alexa: say a keyword, then process the rest of the audio as text. I have a working version that does this, but if you say the keyword and pause briefly, it times out. I can't say the keyword, wait one or two seconds, and then speak the rest of the command.

I'd like to be able to say the keyword and have it wait 10 or 15 seconds before it actually times out.

I've tried setting these properties, but nothing changed:

SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");
SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "15000");

SpeechRecognizer.Properties.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");
SpeechRecognizer.Properties.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "15000");

I'm using

SpeechRecognizer.StartKeywordRecognitionAsync()

to do the recognition. I've tried stopping it with

SpeechRecognizer.StopKeywordRecognitionAsync()

and then calling

SpeechRecognizer.StartContinuousRecognitionAsync()

from within each of the SessionStarted, SessionStopped, Recognizing, and Recognized events. The Canceled event is never called.

I expected it to wait after the keyword was spoken, but it doesn't. Does anyone know how to do this? What am I missing?

c# .net speech-to-text microsoft-cognitive speech
1 Answer

I was able to figure this out by reading the documentation here: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/

The basic premise is to first create a KeywordRecognizer and call its recognize function to capture the keyword. The result is a RecognizedKeyword, and from there you create a SpeechRecognizer. Call its recognize function and you get the rest of the command. The default timeout from capturing the keyword is 30 seconds.

Here is some code as an example:

    using System;
    using System.Threading.Tasks;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;
    
    namespace SpeechRecognitionDemo
    {
        class Program
        {
            static SpeechConfig speechConfig;
            static KeywordRecognitionModel keywordModel;
            static AudioConfig audioConfig;
            static TaskCompletionSource<int> stopRecognition;
    
            static async Task Main(string[] args)
            {
                // Creates an instance of a speech config with specified subscription key and service region.
                // Replace with your own subscription key and service region (e.g., "westus").
                speechConfig = SpeechConfig.FromSubscription("subscription key", "region");
                speechConfig.SpeechRecognitionLanguage = "en-US";
    
                // set this property to allow more time between words in the command
                speechConfig.SetProperty(PropertyId.Speech_SegmentationSilenceTimeoutMs, "2000");
    
                // Creates an instance of a keyword recognition model. Update this to
                // point to the location of your keyword recognition model.
                keywordModel = KeywordRecognitionModel.FromFile("keywords.table");
                audioConfig = AudioConfig.FromDefaultMicrophoneInput();
    
                await RunAssistant();
            }
    
            static async Task RunAssistant()
            {
                bool keepRunning = true;
    
                while (keepRunning)
                {
                    // Starts recognizing.
                    Console.WriteLine($"Say something starting with the keyword 'Hey Assistant' followed by whatever you want...");
    
                    stopRecognition = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);
    
                    using (var keywordRecognizer = new KeywordRecognizer(audioConfig))
                    {
                        // recognize the keywords
                        KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);
    
                        if (result.Reason == ResultReason.RecognizedKeyword)
                        {
                            Console.WriteLine($"RECOGNIZED KEYWORD: Text={result.Text}");
    
                            using (var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig))
                            {
                                // Subscribes to events.
                                speechRecognizer.Recognizing += (s, e) =>
                                {
                                    if (e.Result.Reason == ResultReason.RecognizingSpeech)
                                    {
                                        Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
                                    }
                                };
    
                                speechRecognizer.Recognized += (s, e) =>
                                {
                                    if (e.Result.Reason == ResultReason.RecognizedSpeech)
                                    {
                                        Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
                                    }
                                    else if (e.Result.Reason == ResultReason.NoMatch)
                                    {
                                        Console.WriteLine("NOMATCH: Speech could not be recognized.");
                                    }
                                };
    
                                speechRecognizer.SessionStarted += (s, e) =>
                                {
                                    Console.WriteLine("\nSession started event.\n");
                                };
    
                                speechRecognizer.SessionStopped += (s, e) =>
                                {
                                    Console.WriteLine("\nSession stopped event.");
                                    Console.WriteLine("\nStop recognition.");
    
                                    stopRecognition.TrySetResult(0);
                                };

                                // now recognize the commands
                                await speechRecognizer.RecognizeOnceAsync();
                            }
                        }
    
                        if (result.Reason == ResultReason.Canceled)
                        {
                            Console.WriteLine($"CANCELLED KEYWORD");
                            stopRecognition.TrySetResult(0);
                        }
    
                    if (result.Reason == ResultReason.NoMatch)
                    {
                        Console.WriteLine($"NO MATCH KEYWORD");
                        // Complete the task here too, otherwise Task.WaitAny below blocks forever.
                        stopRecognition.TrySetResult(0);
                    }
    
                        // Use Task.WaitAny to keep the task rooted.
                        Task.WaitAny(new[] { stopRecognition.Task });
    
                        Console.WriteLine("\n");
                    }
                }
                audioConfig.Dispose();
            }
        }
    }

You need to create the keywords.table file using Speech Studio, which is fairly self-explanatory. You will also need a subscription ID, and then you download a model for offline use.

This example waits for a keyword, then listens for the rest of the command. It prints the results to the console and then loops back to do it all again.
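As a side note, if the 30-second default isn't what you want, the silence-timeout properties mentioned in the question can be set on the SpeechConfig that the post-keyword SpeechRecognizer is built from. Here is a minimal sketch, assuming these properties take effect for RecognizeOnceAsync (I haven't verified the exact behavior across SDK versions, and the millisecond values are only illustrative):

    // Sketch: tuning the silence timeouts on the SpeechConfig used for the
    // post-keyword SpeechRecognizer (assumed behavior; values are examples).
    var config = SpeechConfig.FromSubscription("subscription key", "region");

    // How long the recognizer waits for speech to begin before timing out.
    config.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");

    // How long trailing silence can last before the utterance is considered finished.
    config.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "2000");

    // How much silence is allowed between words within one utterance.
    config.SetProperty(PropertyId.Speech_SegmentationSilenceTimeoutMs, "2000");

Passing this config to `new SpeechRecognizer(config, audioConfig)` in the loop above should then apply the adjusted timeouts to the command recognition step.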
