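Python Speech Recognition library - always listen?

Question

I've recently been using the speech recognition library in Python to launch applications. I eventually plan to use the library for voice-activated home automation via the Raspberry Pi GPIO.

I have this working: it detects my voice and launches the application. The problem is that it seems to get stuck on the one word I say (for example, I say "Internet" and it launches Chrome an infinite number of times).

This is the unexpected behavior I'm seeing from the while loop. I can't figure out how to stop it from looping. Do I need to do something inside the loop to make it work properly? Please see the code below.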

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
        audio = r.listen(source)

def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()

Answer 1

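Just in case, here is an example of how to listen continuously for a keyword with pocketsphinx, which is easier than continuously sending audio to Google. It also gives you a more flexible solution.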

import sys, os, pyaudio
from pocketsphinx import *

modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)

decoder = Decoder(config)
decoder.start_utt('spotting')

# Open the default microphone as 16 kHz mono 16-bit audio
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    # hyp() returns None until the keyphrase has been spotted
    if decoder.hyp() is not None and decoder.hyp().hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')

Answer 2

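The problem is that you only actually listen for speech once, at the start of the program, and then repeatedly call recognize on the same saved piece of audio. Move the code that actually listens for speech into the while loop: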

import pyaudio,os
import speech_recognition as sr


def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction(source):
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()

if __name__ == "__main__":
    r = sr.Recognizer()
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)
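
As a follow-up to the fix above, here is a minimal sketch of the same loop with a lookup table instead of the if/elif chain and with basic error handling. It assumes a newer release of the SpeechRecognition package, where `recognize_google`, `UnknownValueError` and `RequestError` are available (the code above uses the older `recognize` call). The names `COMMANDS`, `command_for` and `listen_forever` are made up for illustration.

```python
import os

# Hypothetical command table: spoken word (lower-cased) -> program to start.
COMMANDS = {
    "excel": "excel.exe",
    "internet": "chrome.exe",
    "music": "wmplayer.exe",
}


def command_for(text):
    """Return the program mapped to the recognized text, or None."""
    return COMMANDS.get(text.strip().lower())


def listen_forever():
    # Imported here so command_for() can be used without a microphone.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        while True:
            # Record fresh audio on every pass instead of reusing one clip.
            audio = recognizer.listen(source)
            try:
                text = recognizer.recognize_google(audio)
            except sr.UnknownValueError:
                continue  # speech was not understood; listen again
            except sr.RequestError as err:
                print("Recognition service unavailable:", err)
                continue
            print(text)
            program = command_for(text)
            if program is not None:
                os.system("start " + program)  # Windows-only, as in the question
```

Calling `listen_forever()` starts the loop. Matching on the lower-cased text also avoids missing a command just because the recognizer returned "internet" rather than "Internet".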

Answer 3

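I've spent a lot of time researching this problem.

I'm currently developing an open-source, cross-platform virtual assistant program for Python 3 called Athena Voice: https://github.com/athena-voice/athena-voice-client

Users can use it much like Siri, Cortana, or Amazon Echo.

It also uses a very simple "module" system, so users can easily write their own modules to extend its functionality. Let me know if that's useful.

Otherwise, I recommend looking into Pocketsphinx and Google's Python speech-to-text/text-to-speech packages.

On Python 3.4, Pocketsphinx can be installed with: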

pip install pocketsphinx

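However, you have to install the PyAudio dependency separately (unofficial download): http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

The two Google packages can be installed with the following command: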

pip install SpeechRecognition gTTS

Google STT: https://pypi.python.org/pypi/SpeechRecognition/

Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2

Pocketsphinx should be used for offline wake-word recognition, and Google STT should be used for active listening.
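
The split described above can be sketched as a two-stage loop. This is only an outline under stated assumptions: `run_assistant` and its three callables are hypothetical names, and the actual wake-word spotter (for example Pocketsphinx) and online recognizer (for example Google STT) are passed in rather than implemented here.

```python
def run_assistant(wait_for_wake_word, transcribe_command, handle, max_rounds=None):
    """Alternate between offline wake-word spotting and online transcription.

    wait_for_wake_word: blocks until the offline spotter hears the keyphrase.
    transcribe_command: records one utterance and returns its text ("" if none).
    handle:             acts on the transcribed command.
    max_rounds:         optional limit, mainly useful for testing.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        wait_for_wake_word()         # cheap, runs locally all the time
        text = transcribe_command()  # only now is audio sent to the online service
        if text:
            handle(text)
        rounds += 1
```

With this arrangement, audio only leaves the machine after the wake word has been heard, which is the main reason for combining the two recognizers.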

Answer 4

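Unfortunately, you have to initialize the microphone on every pass through the loop, because this module relies on the ambient-noise calibration call below, which helps it understand your voice even in a noisy room. Setting the threshold takes time, so a few of your words may be skipped if you issue commands continuously.

r.adjust_for_ambient_noise(source)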
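
One possible compromise, offered as a sketch rather than as what the library requires, is to recalibrate only every few clips and with a short duration, so that less speech is lost to calibration. The helper name `audio_clips` and its parameters `calibrate_every` and `rounds` are made up for illustration; `recognizer` and `source` are expected to be an `sr.Recognizer()` and an opened `sr.Microphone()`.

```python
def audio_clips(recognizer, source, calibrate_every=10, duration=0.5, rounds=None):
    """Yield recorded clips, recalibrating for ambient noise every N clips."""
    count = 0
    while rounds is None or count < rounds:
        if count % calibrate_every == 0:
            # Calibration blocks for `duration` seconds; keep it short.
            recognizer.adjust_for_ambient_noise(source, duration=duration)
        yield recognizer.listen(source)
        count += 1
```

Each clip yielded by `audio_clips(r, source)` can then be passed to the recognizer as in the earlier answers. Whether a less frequent calibration is good enough depends on how quickly the background noise in the room changes.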
