How can I detect when a user is speaking into the microphone, and when they are not, using HTML5 and JavaScript?

Question

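I am developing a web application in which I need to present users with a list of available microphones, so that they can pick the one they prefer and test it (the way Google Meet lets you test your microphone, or any other application of a similar nature).

I can render the list of connected devices using the following: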

navigator.mediaDevices.getUserMedia({audio: true})
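
A side note for readers: `getUserMedia` resolves with a `MediaStream` rather than a device list, so the list itself is usually built with `navigator.mediaDevices.enumerateDevices()`. The following is a minimal sketch under that assumption; the helper names `pickMicrophones` and `listMicrophones` and the fallback label text are hypothetical and do not come from the original post.

```javascript
// Keep only audio inputs from a list shaped like the result of
// navigator.mediaDevices.enumerateDevices(). (Hypothetical helper.)
function pickMicrophones(devices) {
  return devices
    .filter((device) => device.kind === 'audioinput')
    .map((device, index) => ({
      deviceId: device.deviceId,
      // Labels are typically empty until microphone permission is granted,
      // so fall back to a generic name (assumed wording).
      label: device.label || `Microphone ${index + 1}`,
    }));
}

// Browser-only: ask for permission first so that labels are populated,
// then enumerate the devices. (Hypothetical helper.)
async function listMicrophones() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  // The stream was only needed to trigger the permission prompt.
  stream.getTracks().forEach((track) => track.stop());
  const devices = await navigator.mediaDevices.enumerateDevices();
  return pickMicrophones(devices);
}
```

The `deviceId` of the chosen entry is what would be passed as `selectedDeviceId` in the constraints.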

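Now, once the user selects a microphone, I want to detect when the user is speaking and when they are not, so that I can show some visual cue.

This is where I need help: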

const constraints = { audio: { deviceId: { exact: selectedDeviceId } } };

navigator.mediaDevices
  .getUserMedia(constraints)
  .then((stream) => {
    // What should I do here with incoming stream?
    // So that I can react to the input coming from microphone
    // Can I make use of any stream event and check for some condition in the handler? to detect when user is speaking
    // If there is no such event, what are my options?
  })
  .catch(function handleError(error) {
    console.log("error: ", error);
  });

Answer 1

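To detect when the user is speaking or making noise through the microphone, you can use the Web Audio API to analyze the audio stream coming from it. You do this by creating an AudioContext and setting up an AnalyserNode to inspect the audio data. Here is a step-by-step example: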

const constraints = { audio: { deviceId: { exact: selectedDeviceId } } };

navigator.mediaDevices
  .getUserMedia(constraints)
  .then((stream) => {
    // Create an AudioContext
    const audioContext = new (window.AudioContext || window.webkitAudioContext)();

    // Create an AnalyserNode to analyze the audio data
    const analyser = audioContext.createAnalyser();
    analyser.fftSize = 256; // You can adjust this value for better accuracy

    // Connect the microphone stream to the AnalyserNode
    const microphoneStream = audioContext.createMediaStreamSource(stream);
    microphoneStream.connect(analyser);

    // Create a buffer to store the audio data
    const bufferLength = analyser.frequencyBinCount;
    const dataArray = new Uint8Array(bufferLength);

    // Function to detect speech
    function detectSpeech() {
      analyser.getByteFrequencyData(dataArray);

      // Calculate the average amplitude of the audio data
      const averageAmplitude = dataArray.reduce((acc, value) => acc + value, 0) / bufferLength;

      // You can set a threshold for what constitutes "speech"
      const speechThreshold = 100; // Adjust this threshold as needed

      if (averageAmplitude > speechThreshold) {
        // User is speaking or making noise
        console.log('User is speaking.');
        // Add your visual clue here
      } else {
        // User is not speaking
        console.log('User is not speaking.');
        // Remove or hide the visual clue
      }

      // Call the function recursively to continuously monitor audio
      requestAnimationFrame(detectSpeech);
    }

    // Start monitoring the microphone input for speech
    detectSpeech();
  })
  .catch(function handleError(error) {
    console.log("error: ", error);
  });

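In this code:

We create an AudioContext to process the audio data.
We create an AnalyserNode to analyze the audio data.
We connect the microphone stream to the AnalyserNode.
We set up a buffer (dataArray) to hold the audio data for analysis.
We define a detectSpeech function that computes the average amplitude of the audio data and decides, based on a threshold, whether the user is speaking.
We call detectSpeech repeatedly via requestAnimationFrame to keep monitoring the audio input.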
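
The level calculation can also be pulled out into a small pure function, which makes it easier to reason about and test. The sketch below is a variation, not the method used above: it assumes the buffer is filled with `analyser.getByteTimeDomainData(dataArray)` (waveform bytes centered on 128) instead of frequency data, and computes a root-mean-square level between 0 and 1. The name `rmsLevel` is hypothetical.

```javascript
// Compute the RMS level (0..1) of unsigned 8-bit waveform samples, such as
// those written by AnalyserNode.getByteTimeDomainData(). (Hypothetical helper.)
function rmsLevel(timeDomainBytes) {
  if (timeDomainBytes.length === 0) return 0;
  let sumOfSquares = 0;
  for (const byte of timeDomainBytes) {
    // Silence is 128; shift and scale so samples fall roughly in -1..1.
    const sample = (byte - 128) / 128;
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / timeDomainBytes.length);
}
```

With this variation the threshold would be a small fraction rather than a byte value, and a suitable number has to be found by experiment for the microphone and environment in question.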

You can adjust the speechThreshold value to fine-tune what counts as speaking. Show your visual cue while the user is speaking, and hide or remove it when they are not.
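
A single threshold checked on every animation frame can make the visual cue flicker during the short pauses between words. One possible refinement, shown here as a sketch and not as part of the original answer, is to keep the "speaking" state alive for a short hold time after the level last exceeded the threshold. The name `createSpeechGate` and the `holdMs` default of 300 ms are assumptions chosen for illustration.

```javascript
// Returns an update(level, nowMs) function that reports whether the user
// should be treated as speaking. (Hypothetical helper.)
function createSpeechGate({ threshold = 100, holdMs = 300 } = {}) {
  let speaking = false;
  let lastLoudAt = -Infinity; // timestamp of the last frame above threshold

  return function update(level, nowMs) {
    if (level > threshold) {
      // Loud frame: (re)enter the speaking state and remember when.
      lastLoudAt = nowMs;
      speaking = true;
    } else if (speaking && nowMs - lastLoudAt > holdMs) {
      // Quiet for longer than the hold time: leave the speaking state.
      speaking = false;
    }
    return speaking;
  };
}
```

Inside detectSpeech, this could be used as `gate(averageAmplitude, performance.now())`, with `gate` created once outside the loop, and the visual cue toggled only when the returned value changes.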
