Speaker identification using Azure speech recognition

Question (votes: 0, answers: 1)

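I am trying to identify different speaker IDs and want to display the conversation along with each speaker's ID/name. Here is my code. On the line 'var speaker = e.Result.Properties.GetProperty(PropertyId.Speaker)' I get an error: PropertyId does not contain a definition for "Speaker". I am new to the speech recognition service, so can anyone explain what I should do and why I am getting this error? I am using version 1.35.0 of the speech recognition service with WinForms and C#.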

private async void ProcessWavFile(string filePath)
{
    try
    {
        // Replace with your subscription key and region
        string subscriptionKey = "mykey";
        string region = "eastus2";

        // Configure speech recognizer for the WAV file
        var config = SpeechConfig.FromSubscription(subscriptionKey, region);
        using (var audioConfig = AudioConfig.FromWavFileInput(filePath))
        using (var recognizer = new SpeechRecognizer(config, audioConfig))
        {
            // Subscribe to Recognized event for continuous recognition
            recognizer.Recognized += async (s, e) =>
            {
                if (e.Result.Reason == ResultReason.RecognizedSpeech)
                {
                    var speaker = e.Result.Properties.GetProperty(PropertyId.Speaker);
                    if (speaker != null)
                    {
                        var speakerId = speaker.ToString();
                        // Use the speaker ID as needed
                        recognizedTextBox.Invoke((MethodInvoker)delegate
                        {
                            recognizedTextBox.AppendText($"Speaker ID: {speakerId}, Text: {e.Result.Text}{Environment.NewLine}");
                        });
                    }
                    else
                    {
                        recognizedTextBox.Invoke((MethodInvoker)delegate
                        {
                            recognizedTextBox.AppendText($"Speaker ID not available, Text: {e.Result.Text}{Environment.NewLine}");
                        });
                    }
                }
            };

            // Start continuous recognition
            await recognizer.StartContinuousRecognitionAsync();

            // Wait for recognition to complete
            await Task.Delay(TimeSpan.FromSeconds(100)); // Adjust the delay as needed

            // Stop continuous recognition
            await recognizer.StopContinuousRecognitionAsync();
        }
    }
    catch (Exception ex)
    {
        // Handle any exceptions that occur during processing
        MessageBox.Show($"An error occurred: {ex.Message}", "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
}

I want to identify each of the different speakers in the audio separately, by name or ID.

azure winforms azure-cognitive-services
1 answer
0 votes

I was able to show a speaker ID next to the recognized text in the WinForms application below, using a default speaker label.

Code:

using System;
using System.Windows.Forms;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using System.Threading.Tasks;

namespace WinFormsApp2
{
    public partial class Form1 : Form
    {
        private SpeechRecognizer recognizer;
        // Fixed default label: SpeechRecognizer results do not say who is speaking
        private string currentSpeakerId = "Guest-1";
        public Form1()
        {
            InitializeComponent();
        }

        private async void Form1_Load(object sender, EventArgs e)
        {
            string filePath = "path/to/.wav file";
            string subscriptionKey = "<speech_key>";
            string region = "<speech_region>";
            await ProcessWavFile(filePath, subscriptionKey, region);
        }

        private async Task ProcessWavFile(string filePath, string subscriptionKey, string region)
        {
            try
            {
                var config = SpeechConfig.FromSubscription(subscriptionKey, region);
                using (var audioConfig = AudioConfig.FromWavFileInput(filePath))
                {
                    recognizer = new SpeechRecognizer(config, audioConfig);
                    recognizer.Recognized += (s, e) =>
                    {
                        if (e.Result.Reason == ResultReason.RecognizedSpeech)
                        {
                            recognizedTextBox.Invoke((MethodInvoker)delegate
                            {
                                recognizedTextBox.AppendText($"Speaker ID: {currentSpeakerId}, Text: {e.Result.Text}{Environment.NewLine}");
                            });
                        }
                    };
                    await recognizer.StartContinuousRecognitionAsync();
                    await Task.Delay(TimeSpan.FromSeconds(100)); 
                    await recognizer.StopContinuousRecognitionAsync();
                }
            }
            catch (Exception ex)
            {
                MessageBox.Show($"An error occurred: {ex.Message}", "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
            }
        }
    }
}

Form1.Designer.cs:

namespace WinFormsApp2
{
    partial class Form1
    {
        private System.ComponentModel.IContainer components = null;
        protected override void Dispose(bool disposing)
        {
            if (disposing && (components != null))
            {
                components.Dispose();
            }
            base.Dispose(disposing);
        }

        #region Windows Form Designer generated code
        private void InitializeComponent()
        {
            this.recognizedTextBox = new System.Windows.Forms.TextBox();
            this.SuspendLayout();

            this.recognizedTextBox.Location = new System.Drawing.Point(12, 12);
            this.recognizedTextBox.Multiline = true;
            this.recognizedTextBox.Name = "recognizedTextBox";
            this.recognizedTextBox.ScrollBars = System.Windows.Forms.ScrollBars.Vertical;
            this.recognizedTextBox.Size = new System.Drawing.Size(776, 426);
            this.recognizedTextBox.TabIndex = 0;

            this.AutoScaleDimensions = new System.Drawing.SizeF(6F, 13F);
            this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font;
            this.ClientSize = new System.Drawing.Size(800, 450);
            this.Controls.Add(this.recognizedTextBox);
            this.Name = "Form1";
            this.Text = "Speech Transcription";
            this.Load += new System.EventHandler(this.Form1_Load);
            this.ResumeLayout(false);
            this.PerformLayout();

        }

        #endregion

        private System.Windows.Forms.TextBox recognizedTextBox;
    }
}

Output:

The WinForms project ran successfully and produced the speaker ID and text output shown below.

Speaker ID: Guest-1, Text: Hello, this is a test of the speech synthesis service.
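
Note that the Form1 listing above prints the fixed value of currentSpeakerId for every utterance, so all text is attributed to Guest-1. If the service itself should assign the labels, one option may be the SDK's ConversationTranscriber (namespace Microsoft.CognitiveServices.Speech.Transcription), whose results expose a SpeakerId property. The following is only a minimal sketch, assuming your SDK version ships that class with real-time diarization. DiarizationSketch, TranscribeAsync, and the onLine callback are hypothetical names introduced here for illustration and are not part of the SDK.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Transcription;

// Hypothetical helper (not part of the SDK): transcribes a WAV file and
// reports one formatted line per finished utterance through onLine.
public static class DiarizationSketch
{
    public static async Task TranscribeAsync(
        string filePath, string subscriptionKey, string region, Action<string> onLine)
    {
        var config = SpeechConfig.FromSubscription(subscriptionKey, region);

        // Completed when the session ends, instead of waiting a fixed delay.
        var stopped = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        using (var audioConfig = AudioConfig.FromWavFileInput(filePath))
        using (var transcriber = new ConversationTranscriber(config, audioConfig))
        {
            // Final results carry the speaker label assigned by the service.
            transcriber.Transcribed += (s, e) =>
            {
                if (e.Result.Reason == ResultReason.RecognizedSpeech)
                {
                    onLine($"Speaker ID: {e.Result.SpeakerId}, Text: {e.Result.Text}");
                }
            };

            // Raised on errors and also when the end of the file is reached.
            transcriber.Canceled += (s, e) =>
            {
                if (e.Reason == CancellationReason.Error)
                {
                    onLine($"Canceled: {e.ErrorCode} {e.ErrorDetails}");
                }
                stopped.TrySetResult(false);
            };

            transcriber.SessionStopped += (s, e) => stopped.TrySetResult(true);

            await transcriber.StartTranscribingAsync();
            await stopped.Task;
            await transcriber.StopTranscribingAsync();
        }
    }
}
```

From Form1_Load this could be called as await DiarizationSketch.TranscribeAsync(filePath, subscriptionKey, region, line => recognizedTextBox.Invoke((MethodInvoker)(() => recognizedTextBox.AppendText(line + Environment.NewLine)))). Waiting on SessionStopped rather than Task.Delay means short files finish early and long files are not cut off after 100 seconds.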

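Since the question asks for names as well as IDs, a small lookup can translate a label such as Guest-1 (as in the output above) into a display name before the line is appended to the text box. This is a sketch only: SpeakerNameMap and its methods are hypothetical names, and which person corresponds to which label has to be decided by the application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps speaker labels to display names and falls back
// to the raw label when no name has been assigned.
public sealed class SpeakerNameMap
{
    private readonly Dictionary<string, string> names =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Associates a speaker label with a display name.
    public void Assign(string speakerId, string displayName)
    {
        names[speakerId] = displayName;
    }

    // Returns the assigned name, the raw label if none is assigned,
    // or "Unknown" for a missing label.
    public string Resolve(string speakerId)
    {
        if (string.IsNullOrEmpty(speakerId))
        {
            return "Unknown";
        }
        return names.TryGetValue(speakerId, out var name) ? name : speakerId;
    }

    // Formats one line of the conversation for display.
    public string Format(string speakerId, string text)
    {
        return $"{Resolve(speakerId)}: {text}";
    }
}
```

For example, after map.Assign("Guest-1", "Alice"), the call map.Format("Guest-1", "Hello") yields "Alice: Hello", while an unassigned label such as "Guest-2" is shown unchanged.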
