Azure Speech-to-Text: '_io.BytesIO' object has no attribute '_handle'

Question (votes: 0, answers: 1)

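I am trying to transcribe a .wav file containing recorded speech. This is a mobile app, so I am developing with React Native and Expo Go. The audio is sent to an Azure HTTP-triggered function, where the Base64-encoded audio is decoded and then passed to Azure's speech recognition. I have made sure the sample rate, channel count, and sample width match what the SDK expects.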

def speech_recognize_continuous_from_file(audio_data):
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

    # ERROR OCCURS HERE: stream=audio_data does not work
    audio_config = speechsdk.audio.AudioConfig(stream=audio_data)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)


def transcriptionFunction(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    try:
        req_body = req.get_json()
        audioBase64 = req_body.get('audioBase64')

        # Converts base64 to wav
        decodedAudio = base64.b64decode(audioBase64)
        audioIO = io.BytesIO(decodedAudio)

        # Begins transcription
        speech_recognize_continuous_from_file(audioIO)
        

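        return func.HttpResponse("Check Server Console for response", status_code=200)

    except Exception as e:
        logging.error(f"Error: {e}")
        return func.HttpResponse("Internal Server Error", status_code=500)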

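I have tested my continuous speech recognition function with a .wav file on disk, so I know it works. I have also checked that the .wav file is in the correct format. Because this is a serverless function, I cannot use filename=, since there is no local storage.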

python azure-functions wav azure-cognitive-services speech-to-text
1 answer
0 votes

The error '_io.BytesIO' object has no attribute '_handle' indicates that the Speech SDK does not recognize the object passed as the stream argument.

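The problem is caused by passing a BytesIO object directly to speechsdk.audio.AudioConfig(stream=audio_data). The stream parameter expects one of the SDK's own audio input stream objects (for example a speechsdk.audio.PushAudioInputStream), not a generic Python file-like object. The SDK reads an internal _handle attribute from that stream, which a BytesIO object does not have, hence the error.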
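If writing a file is not an option, one alternative is to feed the in-memory audio through a push stream. The following is a minimal sketch, not a tested drop-in: the helper names read_wav_pcm and transcribe_wav_bytes are hypothetical, and it assumes uncompressed PCM WAV input containing a single short utterance. The WAV header is parsed with the standard wave module so that the stream format matches the audio.

```python
import io
import wave


def read_wav_pcm(wav_bytes):
    """Return (sample_rate, bits_per_sample, channels, pcm_frames) for PCM WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        sample_rate = wav.getframerate()
        bits_per_sample = wav.getsampwidth() * 8
        channels = wav.getnchannels()
        pcm_frames = wav.readframes(wav.getnframes())
    return sample_rate, bits_per_sample, channels, pcm_frames


def transcribe_wav_bytes(wav_bytes, speech_key, service_region):
    """Hypothetical helper: recognize one utterance from in-memory WAV bytes."""
    # Imported here so the WAV parsing above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    sample_rate, bits_per_sample, channels, pcm_frames = read_wav_pcm(wav_bytes)

    # Describe the raw PCM layout so the SDK can interpret the pushed bytes.
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=sample_rate,
        bits_per_sample=bits_per_sample,
        channels=channels,
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    push_stream.write(pcm_frames)
    push_stream.close()  # signals that no more audio will arrive

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    audio_config = speechsdk.audio.AudioConfig(stream=push_stream)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    return ""
```

In the HTTP trigger, the decoded Base64 bytes would be passed straight to transcribe_wav_bytes, with no BytesIO wrapper and no file on disk.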

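To resolve this, you can write the decoded audio to a .wav file and use speechsdk.audio.AudioConfig(filename="temp.wav") instead of passing the raw audio data directly to the speechsdk.audio.AudioConfig() constructor. Here is the modified code: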

Code

import logging
import azure.functions as func
import base64
import os
import azure.cognitiveservices.speech as speechsdk

speech_key = "<speech_key>"
service_region = "<speech_region>"

def speech_recognize_continuous_from_stream(audio_path):
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    # Read the audio from the WAV file at audio_path instead of an in-memory stream
    audio_config = speechsdk.audio.AudioConfig(filename=audio_path)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    # Despite the function name, recognize_once() returns after a single utterance
    result = speech_recognizer.recognize_once()
    return result.text if result.reason == speechsdk.ResultReason.RecognizedSpeech else ""

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    try:
        req_body = req.get_json()
        audioBase64 = req_body.get('audioBase64')
        decodedAudio = base64.b64decode(audioBase64)
        
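        # Write the decoded audio to a temporary file the SDK can open by name
        with open("temp.wav", "wb") as audio_file:
            audio_file.write(decodedAudio)
        try:
            transcription_result = speech_recognize_continuous_from_stream("temp.wav")
        finally:
            # Remove the temporary file even if recognition fails
            os.unlink("temp.wav")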

        return func.HttpResponse(transcription_result, status_code=200)

    except Exception as e:
        logging.error(f"Error: {str(e)}")
        return func.HttpResponse("Internal Server Error", status_code=500)
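On a hosted Function App the working directory may not be writable, and a fixed name like "temp.wav" can collide when two requests overlap. A hedged variant of the temp-file step uses the system temp directory with a unique name and always cleans up. The helper name temp_wav_file is hypothetical and not part of the original answer.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temp_wav_file(wav_bytes):
    """Write wav_bytes to a uniquely named temp file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        # Close the file before yielding so another reader can open it by name.
        with os.fdopen(fd, "wb") as audio_file:
            audio_file.write(wav_bytes)
        yield path
    finally:
        os.unlink(path)
```

Inside the handler, the decoded bytes would be wrapped as `with temp_wav_file(decodedAudio) as path:` and the yielded path passed to speechsdk.audio.AudioConfig(filename=path).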

Postman output

{
    "audioBase64":"your_base64_data"
}
Hello, this is a test of the speech synthesis service.
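A request body of the shape shown above can be produced by Base64-encoding the raw .wav bytes. This is a small sketch for testing from a script rather than Postman; build_request_body is a hypothetical helper name.

```python
import base64
import json


def build_request_body(wav_bytes):
    """Return a JSON string with the audio Base64-encoded under 'audioBase64'."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return json.dumps({"audioBase64": encoded})
```

The function decodes this field with base64.b64decode, so the bytes it receives are identical to the ones encoded here.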


Output

The function ran successfully, as shown below.

C:\Users\xxxxxxx\Documents\xxxxxxx>func start
Found Python version 3.10.11 (python).

Azure Functions Core Tools
Core Tools Version:       4.0.5030 Commit hash: N/A  (64-bit)
Function Runtime Version: 4.15.2.20177


Functions:

        HttpTrigger1: [GET,POST] http://localhost:7071/api/HttpTrigger1

For detailed output, run func with --verbose flag.
[2024-02-10T19:58:48.856Z] Worker process started and initialized.
[2024-02-10T19:58:54.658Z] Host lock lease acquired by instance ID '00000xxxxxxxxxxxxxxxxxx'.
[2024-02-10T19:58:56.634Z] Executing 'Functions.HttpTrigger1' (Reason='This function was programmatically called via the host APIs.', Id=3cd9c444b944xxxxxxxxxxxx)
[2024-02-10T19:58:56.843Z] Python HTTP trigger function processed a request.
[2024-02-10T19:59:00.598Z] Executed 'Functions.HttpTrigger1' (Succeeded, Id=3cd9c444xxxxxxxxxx, Duration=4040ms)

