Real-time speech recognition

Question | Votes: 12 | Answers: 1

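I have a Python script that uses the speech_recognition package to recognize speech and return the text of what was spoken. However, the transcription is delayed by a few seconds. Is there another way to write this script so that it returns each word as it is spoken? I have a second script that does this with the pysphinx package, but its results are wildly inaccurate.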

Install the dependencies:

pip install SpeechRecognition
pip install pocketsphinx

Script 1 - delayed speech-to-text:

import speech_recognition as sr  

# obtain audio from the microphone  
r = sr.Recognizer()  
with sr.Microphone() as source:  
    print("Please wait. Calibrating microphone...")  
    # listen for 5 seconds and create the ambient noise energy level  
    r.adjust_for_ambient_noise(source, duration=5)  
    print("Say something!")  
    audio = r.listen(source)  

    # recognize speech using Sphinx  
    try:  
        print("Sphinx thinks you said '" + r.recognize_sphinx(audio) + "'")  
    except sr.UnknownValueError:  
        print("Sphinx could not understand audio")  
    except sr.RequestError as e:  
        print("Sphinx error; {0}".format(e))

Script 2 - immediate speech-to-text, but inaccurate:

import os
from pocketsphinx import LiveSpeech, get_model_path

model_path = get_model_path()
speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dic=os.path.join(model_path, 'cmudict-en-us.dict')
)
for phrase in speech:
    print(phrase)

Tags: python speech-recognition speech-to-text cmusphinx pocketsphinx

1 Answer

2 votes

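If you happen to have a CUDA-enabled GPU, you can try Mozilla's DeepSpeech GPU library. There is also a CPU version in case you don't have a CUDA-enabled GPU. On a CPU, DeepSpeech takes about 1.3x the duration of an audio file to transcribe it, whereas on a GPU the factor is about 0.3x, i.e. it transcribes a 1-second audio file in roughly 0.33 seconds. Quick start: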

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate

# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu

# Transcribe an audio file.
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm \
    --lm deepspeech-0.6.1-models/lm.binary \
    --trie deepspeech-0.6.1-models/trie \
    --audio audio/2830-3980-0043.wav

Important note: deepspeech-gpu has several dependencies, such as TensorFlow, CUDA, and cuDNN, so check the GitHub repository for more details: https://github.com/mozilla/DeepSpeech
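
To make the speed figures above concrete, here is a minimal sketch of the arithmetic, assuming "1.3x" and "0.3x" mean processing time divided by audio duration (often called the real-time factor). The function names are hypothetical, and the two factors are the ones quoted in this answer, not measurements. A factor above 1.0 means the recognizer falls behind a live microphone stream, which is why the GPU build is the better fit for word-by-word output.

```python
def transcription_seconds(audio_seconds: float, real_time_factor: float) -> float:
    """Estimated processing time for a clip at the given real-time factor."""
    return audio_seconds * real_time_factor


def keeps_up_with_live_audio(real_time_factor: float) -> bool:
    """A recognizer can follow a live stream only if it processes audio
    faster than it arrives, i.e. its real-time factor is below 1.0."""
    return real_time_factor < 1.0


# Factors quoted in the answer (assumed, not measured here).
CPU_RTF = 1.3
GPU_RTF = 0.3

for name, rtf in (("CPU", CPU_RTF), ("GPU", GPU_RTF)):
    seconds = transcription_seconds(10.0, rtf)
    print(f"{name}: 10 s of audio takes ~{seconds:.1f} s; "
          f"keeps up with live audio: {keeps_up_with_live_audio(rtf)}")
```

Actual factors depend on the hardware and the model version, so it is worth timing a known clip on your own machine before relying on these numbers.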
