Azure speech-to-text and TTS are talking to themselves

Question    Votes: 0    Answers: 1

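I'm hoping this is an "oh, that's simpler than I thought" situation, but I can't seem to do duplex text-to-speech and speech-to-text with Azure in C# without the "listener" hearing the "speaker", which creates a bit of an infinite loop.

Question: is there a way to filter out the application's own voice so that it doesn't hear itself, but can still hear the user interrupting it and process that incoming audio?

I realize headphones would solve this, but I need it to work on open speakers.

Any help or guidance is much appreciated! Thanks!

So far I have a pretty standard function that listens to audio from the microphone and streams the recognized text to an event: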

public async Task Listen()
{
    var stopRecognition = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);
    using var audioProcessingOptions = AudioProcessingOptions.Create(AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT);
    using var audioInput = AudioConfig.FromDefaultMicrophoneInput(audioProcessingOptions);

    using var recognizer = new SpeechRecognizer(Config, audioInput);

    recognizer.Recognized += Recognizer_Recognized;

    await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);

    // Waits for completion.
    // Use Task.WaitAny to keep the task rooted.
    Task.WaitAny(new[] { stopRecognition.Task });

    // Stops recognition.
    await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
}

private void Recognizer_Recognized(object? sender, SpeechRecognitionEventArgs e)
{
    // Push the recognized text to an event to display on screen...
}

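Then, when the application wants to say something, it calls the method below.

The problem: I can stop listening while it speaks, but my application tends to talk a lot! So I want to be able to interrupt it now and then to move things along. But if I keep listening to the audio, it hears its own voice! Argh!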

public async Task Talk(string text)
{
    // To support Chinese characters on the Windows platform
    if (Environment.OSVersion.Platform == PlatformID.Win32NT)
    {
        Console.InputEncoding = System.Text.Encoding.Unicode;
        Console.OutputEncoding = System.Text.Encoding.Unicode;
    }

    // Set the voice name, refer to https://aka.ms/speech/voices/neural for full list.
    Config.SpeechSynthesisVoiceName = "en-AU-CarlyNeural";
    // https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=tts

    // Creates a speech synthesizer using the default speaker as audio output.
    using var synthesizer = new SpeechSynthesizer(Config);
    using var result = await synthesizer.SpeakTextAsync(text);
    if (result.Reason == ResultReason.SynthesizingAudioCompleted)
    {
        // hmmm
    }
    else if (result.Reason == ResultReason.Canceled)
    {
        var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
    }
}
c# azure speech-recognition text-to-speech speech-to-text
1 Answer
0 votes

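The problem you are facing is an audio feedback loop: the application hears its own output while it is listening for incoming audio. To prevent the loop, you can use a technique called "audio ducking" or "audio suppression", which temporarily reduces the microphone's sensitivity while the application is speaking. The code below does the simplest version of this: it stops recognition while the application speaks and restarts it afterwards.

I made some changes to your code and got text output from speech input.

Code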

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public class SpeechService
{
    private readonly SpeechConfig Config;
    // Kept as fields (not `using` locals) so they stay alive after Listen() returns.
    private AudioProcessingOptions audioProcessingOptions;
    private AudioConfig audioInput;
    private SpeechRecognizer recognizer;

    public SpeechService(string subscriptionKey, string serviceRegion)
    {
        Config = SpeechConfig.FromSubscription(subscriptionKey, serviceRegion);
    }

    // Starts continuous recognition and returns immediately.
    // Recognition keeps running in the background until Close() is called.
    public async Task Listen()
    {
        audioProcessingOptions = AudioProcessingOptions.Create(AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT);
        audioInput = AudioConfig.FromDefaultMicrophoneInput(audioProcessingOptions);

        recognizer = new SpeechRecognizer(Config, audioInput);
        recognizer.Recognized += Recognizer_Recognized;

        await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
    }

    private void Recognizer_Recognized(object sender, SpeechRecognitionEventArgs e)
    {
        // Skip NoMatch and other results that carry no recognized speech.
        if (e.Result.Reason == ResultReason.RecognizedSpeech)
        {
            Console.WriteLine($"Recognized: {e.Result.Text}");
        }
    }

    public async Task Talk(string text)
    {
        // Pause recognition so the microphone does not transcribe the app's own voice.
        // Trade-off: the user cannot interrupt while the app is speaking.
        if (recognizer != null)
        {
            await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
        }

        try
        {
            Config.SpeechSynthesisVoiceName = "en-AU-CarlyNeural";

            using var synthesizer = new SpeechSynthesizer(Config);
            using var result = await synthesizer.SpeakTextAsync(text).ConfigureAwait(false);

            if (result.Reason == ResultReason.Canceled)
            {
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                Console.WriteLine($"Synthesis canceled: {cancellation.Reason}. {cancellation.ErrorDetails}");
            }
        }
        finally
        {
            // Resume listening even if synthesis failed.
            if (recognizer != null)
            {
                await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
            }
        }
    }

    public async Task Close()
    {
        if (recognizer != null)
        {
            await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
            recognizer.Dispose();
            recognizer = null;
        }
        audioInput?.Dispose();
        audioProcessingOptions?.Dispose();
    }
}

public class Program
{
    public static async Task Main(string[] args)
    {
        string subscriptionKey = "<speech_key>";
        string serviceRegion = "<speech_region>";

        var speechService = new SpeechService(subscriptionKey, serviceRegion);
        await speechService.Listen();

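        while (true)
        {
            Console.Write("Enter text to speak (or 'exit' to quit): ");
            string input = Console.ReadLine();

            // ReadLine returns null at end of input; treat that like 'exit'.
            if (input == null || input.Trim().Equals("exit", StringComparison.OrdinalIgnoreCase))
            {
                break;
            }
            await speechService.Talk(input);
        }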

        await speechService.Close();
    }
}

Output:

It works well; when I spoke a few lines, it gave me the text output below.

[Image: screenshot of the console output]
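Note that stopping recognition inside Talk() avoids the loop, but it also means the user cannot interrupt while the application is speaking. If interruption matters, one possible workaround is to keep the recognizer running and discard any recognized text that closely matches what the application is currently saying. The sketch below is only an illustration of that idea: EchoFilter, its method names, and the 0.7 threshold are hypothetical and not part of the Speech SDK, and the word-overlap heuristic can be fooled by misrecognitions or by a user repeating the application's words.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the Speech SDK): guesses whether a recognized
// phrase is the app's own TTS output picked up by the microphone.
public class EchoFilter
{
    private readonly object gate = new object();
    private readonly double threshold;
    private HashSet<string> spokenWords = new HashSet<string>();

    // threshold: fraction of recognized words that must also occur in the spoken text.
    public EchoFilter(double threshold = 0.7)
    {
        this.threshold = threshold;
    }

    // Call just before the app starts speaking `text`.
    public void BeginSpeaking(string text)
    {
        lock (gate) { spokenWords = Tokenize(text); }
    }

    // Call once the app has finished speaking.
    public void EndSpeaking()
    {
        lock (gate) { spokenWords = new HashSet<string>(); }
    }

    // True when most words of the recognized phrase occur in the text being spoken.
    public bool IsProbablyEcho(string recognized)
    {
        var words = Tokenize(recognized);
        if (words.Count == 0) return true; // empty result: nothing to act on, so drop it

        lock (gate)
        {
            if (spokenWords.Count == 0) return false; // the app is silent, so this is the user
            int overlap = words.Count(w => spokenWords.Contains(w));
            return (double)overlap / words.Count >= threshold;
        }
    }

    // Lower-cases the text, replaces punctuation with spaces, and returns the distinct words.
    private static HashSet<string> Tokenize(string text)
    {
        var cleaned = new string((text ?? string.Empty).ToLowerInvariant()
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray());
        return new HashSet<string>(
            cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

To use it, Talk() would call BeginSpeaking(text) before SpeakTextAsync and EndSpeaking() afterwards instead of stopping the recognizer, and Recognizer_Recognized would return early when IsProbablyEcho(e.Result.Text) is true. Anything that passes the filter can be treated as the user interrupting, at which point the synthesizer could be stopped (for example with StopSpeakingAsync, assuming the synthesizer instance is kept in a field). Acoustic echo cancellation with a speaker reference channel is the more robust fix where the hardware supports it.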
