Speech recognition in Python fails with a strange request error

Question · Votes: 0 · Answers: 2

Speech recognition with the following code doesn't work at all:

import speech_recognition as sr

r = sr.Recognizer()

with sr.Microphone() as source:
    # read the audio data from the default microphone
    audio = r.record(source, duration=4)
    print("Recognizing...")

# convert speech to text using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
    print("Google Speech Recognition thinks you said in English: -  " + r.recognize_google(audio, language="en-US"))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

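Here is the full error. It looks like the request simply failed, yet the same code seems to work fine if I use an uploaded audio file as the source. I have already checked through sr.Microphone that the default option is linked to my actual microphone...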

    ---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/speech_recognition/__init__.py in recognize_google(self, audio_data, key, language, show_all)
    839         try:
--> 840             response = urlopen(request, timeout=self.operation_timeout)
    841         except HTTPError as e:

~/anaconda3/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 

~/anaconda3/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
    530             meth = getattr(processor, meth_name)
--> 531             response = meth(req, response)
    532 

~/anaconda3/lib/python3.7/urllib/request.py in http_response(self, request, response)
    640             response = self.parent.error(
--> 641                 'http', request, response, code, msg, hdrs)
    642 

~/anaconda3/lib/python3.7/urllib/request.py in error(self, proto, *args)
    568             args = (dict, 'default', 'http_error_default') + orig_args
--> 569             return self._call_chain(*args)
    570 

~/anaconda3/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:

~/anaconda3/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 

HTTPError: HTTP Error 400: Bad Request

During handling of the above exception, another exception occurred:

RequestError                              Traceback (most recent call last)
<ipython-input-109-50b94b08a896> in <module>
      3     audio = r.record(source, duration=4)
      4     print("Recognizing...")
----> 5     r.recognize_google(audio, language = "en-US")

~/anaconda3/lib/python3.7/site-packages/speech_recognition/__init__.py in recognize_google(self, audio_data, key, language, show_all)
    840             response = urlopen(request, timeout=self.operation_timeout)
    841         except HTTPError as e:
--> 842             raise RequestError("recognition request failed: {}".format(e.reason))
    843         except URLError as e:
    844             raise RequestError("recognition connection failed: {}".format(e.reason))

RequestError: recognition request failed: Bad Request


python speech-recognition speech google-speech-to-text-api
2 Answers
8
votes

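This may not be a solution for the code you wrote, but it should help the many other people who try to use Google SpeechRecognition and get the same vague error message.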

I solved the same problem by shortening the audio input file to less than 10 MB. Currently, the quota for synchronous requests is about 1 minute (10 MB).

Quoting the documentation:

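There is a size limit of 10 MB on all single requests sent to the API when using local files. In the case of the Recognize and LongRunningRecognize methods, this limit applies to the size of the request sent. For the StreamingRecognize method, the 10 MB limit applies to both the initial StreamingRecognize request and the size of each individual message in the stream. Exceeding this limit will throw an error.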
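To see whether a file is anywhere near that limit before sending it, a quick check along these lines may help. This is only a sketch: `wav_stats` and `fits_in_one_request` are hypothetical helper names, and it assumes "10 MB" means 10 * 1024 * 1024 bytes. The SpeechRecognition library uploads the audio as FLAC, so the WAV size on disk is a rough proxy rather than the exact request size.

```python
import os
import wave

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
MAX_REQUEST_BYTES = 10 * 1024 * 1024


def wav_stats(path):
    """Return (size_in_bytes, duration_in_seconds) for a WAV file."""
    size = os.path.getsize(path)
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return size, duration


def fits_in_one_request(path, limit=MAX_REQUEST_BYTES):
    """True if the file on disk is smaller than the assumed request limit."""
    size, _ = wav_stats(path)
    return size < limit
```

If `fits_in_one_request` returns False, or the duration is well over a minute, shortening or splitting the audio is probably worth trying first.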


0
votes

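As @YetAnotherBot mentioned, a likely cause is that the audio is too large. At least that is what happened to me. I solved it by splitting the audio into chunks: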

from pydub import AudioSegment

def chunk_audio_and_save(audio_path, chunk_length=60000):  # chunk_length in milliseconds
    audio = AudioSegment.from_wav(audio_path)
    length_audio = len(audio)
    chunk_paths = []
    for i, chunk in enumerate(range(0, length_audio, chunk_length)):
        chunk_audio = audio[chunk:chunk + chunk_length]
        chunk_path = f"temp_chunk_{i}.wav"
        chunk_audio.export(chunk_path, format="wav")
        chunk_paths.append(chunk_path)
    return chunk_paths
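To make the slicing in that loop easier to follow, here is a small sketch of the millisecond offsets it steps through. `chunk_bounds` is a hypothetical helper, not part of pydub, and it clamps the last end offset explicitly.

```python
def chunk_bounds(length_ms, chunk_length=60000):
    """Return (start, end) millisecond offsets that cover the whole clip."""
    return [
        (start, min(start + chunk_length, length_ms))
        for start in range(0, length_ms, chunk_length)
    ]


print(chunk_bounds(150000))
# [(0, 60000), (60000, 120000), (120000, 150000)]
```

So a 2.5 minute file ends up as two full 60 second chunks plus a shorter final one.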

Then transcribe each chunk:

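import os
# transcribe_audio() is the answerer's own helper and is not shown in the post;
# audio_path is the path of the original WAV file
chunk_file_paths = chunk_audio_and_save(audio_path)
full_transcript = []
for i, file_path in enumerate(chunk_file_paths):
    print(f"Transcribing chunk {i+1}/{len(chunk_file_paths)}...")
    transcript = transcribe_audio(file_path)
    full_transcript.append(transcript)
    os.remove(file_path)  # Clean up chunk file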
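The post does not show `transcribe_audio`. One possible version, assuming the same SpeechRecognition library as in the question, could look like the sketch below. Returning an empty string for unintelligible chunks is my own choice, not something the answer specifies.

```python
def transcribe_audio(file_path, language="en-US"):
    """Transcribe one WAV chunk with Google Speech Recognition.

    Hypothetical helper: the original answer does not show its implementation.
    """
    # Imported here so the module still loads if SpeechRecognition is missing.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(file_path) as source:
        audio = recognizer.record(source)  # read the whole chunk
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # Assumption: treat an unintelligible chunk as empty text.
        return ""
```

A `sr.RequestError` is deliberately left to propagate, so a chunk that is still rejected by the service does not go unnoticed.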