Encoding audio data as a string (Flask) and decoding it (JavaScript)


I have a Python Flask application with the method shown below. In that method I synthesize speech from text using Azure Text to Speech.

@app.route("/retrieve_speech", methods=['POST'])
def retrieve_speech():
    text= request.form.get('text')
    start = time.time()
    speech_key = "my key"
    speech_region = "my region"
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
    speech_config.endpoint_id = "my endpoint"
    speech_config.speech_synthesis_voice_name = "voice name"
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Audio24Khz160KBitRateMonoMp3)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)

    result = synthesizer.speak_text_async(text=text).get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        # Convert to wav
        audio = AudioSegment.from_file(io.BytesIO(result.audio_data))
        duration = audio.duration_seconds
        data = io.BytesIO()
        audio.export(data, format='wav')
        data.seek(0)

        # Convert binary data to base64 string
        data = base64.b64encode(data.read()).decode('utf-8')

        speech_timing = time.time() - start
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            logging.error("Azure speech synthesis failed: {}".format(cancellation_details.error_details))

    return jsonify(audio_data=data, speech_timing=str(speech_timing), other="other strings")

In my front end (a web page) I call the Flask method from JavaScript as follows:

    $.post("/retrieve_speech", { text: "This is a test" }).done(function (data) {
        var audio_data= data.audio_data;
        var speech_timing = data.speech_timing;
        var other = data.other;

        // Decode base64 string to binary
        var binaryData = atob(audioData);

        // Create an array of 8-bit unsigned integers
        var byteArray = new Uint8Array(binaryData.length);
        for(var i = 0; i < binaryData.length; i++) {
            byteArray[i] = binaryData.charCodeAt(i);
        }

        // Create a blob object from the byte array
        var blob = new Blob([byteArray], {type: 'audio/wav'});

        // Create a URL for the blob object
        var url = URL.createObjectURL(blob);

        // Play the audio
        var audio = new Audio(url);
        audio.play();
    });

The problem is that the audio does not play. In addition, the Flask application logs the following message:

Numba: Attempted to fork from a non-main thread, the TBB library may be in an invalid state in the child process.

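The speech synthesis itself works, so the problem must be in the conversion to WAV or to a string in the Flask application, and/or in decoding the string in JavaScript.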

Is there something wrong with my code?

javascript python flask base64 azure-cognitive-services
1 Answer
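  • Make sure base64.b64encode receives raw bytes as input, not a string:

        data = base64.b64encode(result.audio_data).decode('utf-8')

  • Use AudioSegment.from_wav only when the bytes are already WAV (RIFF) data. With the MP3 output format configured in the question, result.audio_data holds MP3 bytes, so this call applies only if you request a RIFF output format from the SDK instead (for example Riff24Khz16BitMonoPcm):

        audio = AudioSegment.from_wav(io.BytesIO(result.audio_data))

  • Alternatively, skip the conversion and base64-encode the MP3 bytes directly, which is what the revised snippet in this answer does. In that case the Blob type in the front end should be 'audio/mpeg' rather than 'audio/wav'.

  • In the JavaScript from the question, the response field is stored in audio_data but atob is called with audioData. The two names must match, otherwise the callback throws a ReferenceError before any audio is played.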
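
To make the points above concrete, here is a minimal sketch that is not part of the original answer. It base64-encodes raw bytes and guesses the container format from the leading bytes, so the server can tell the front end which MIME type to put on the Blob. The names sniff_audio_mime and encode_audio and the mime_type field are hypothetical, and the header check is a simple heuristic, not a full parser.

```python
import base64


def sniff_audio_mime(audio_bytes: bytes) -> str:
    """Guess a MIME type from the first bytes (hypothetical helper, heuristic only)."""
    # WAV files start with a RIFF header whose form type is "WAVE".
    if audio_bytes[:4] == b"RIFF" and audio_bytes[8:12] == b"WAVE":
        return "audio/wav"
    # MP3 data starts either with an ID3 tag or with an 11-bit frame sync.
    if audio_bytes[:3] == b"ID3":
        return "audio/mpeg"
    if len(audio_bytes) >= 2 and audio_bytes[0] == 0xFF and (audio_bytes[1] & 0xE0) == 0xE0:
        return "audio/mpeg"
    return "application/octet-stream"


def encode_audio(audio_bytes: bytes) -> dict:
    """Return a JSON-serializable dict with the base64 text and the guessed MIME type."""
    if not isinstance(audio_bytes, (bytes, bytearray)):
        raise TypeError("expected raw bytes, got " + type(audio_bytes).__name__)
    return {
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        "mime_type": sniff_audio_mime(audio_bytes),
    }
```

In the Flask route this could be used as jsonify(**encode_audio(result.audio_data)), and the front end would then pass the returned mime_type as the Blob type instead of hard-coding 'audio/wav'.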

    result = synthesizer.speak_text_async(text=text).get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        # Convert MP3 data to base64 string
        data = base64.b64encode(result.audio_data).decode('utf-8')
        speech_timing = time.time() - start
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            logging.error("Azure speech synthesis failed: {}".format(cancellation_details.error_details))
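
One thing the snippet above leaves open is what gets returned when synthesis is cancelled: in the question's route, data and speech_timing are only assigned in the success branch, so the final jsonify call would raise a NameError in that case. A minimal sketch of one way to handle it, assuming a hypothetical helper build_payload and a hypothetical error field in the response:

```python
import base64
import time
from typing import Optional


def build_payload(audio_bytes: Optional[bytes], started_at: float,
                  error: Optional[str] = None) -> dict:
    """Build a response dict that is well defined on success and on failure."""
    # Start from safe defaults so every key exists regardless of the branch taken.
    payload = {"audio_data": None, "speech_timing": None, "error": error}
    if audio_bytes:
        payload["audio_data"] = base64.b64encode(audio_bytes).decode("ascii")
        payload["speech_timing"] = str(time.time() - started_at)
    elif error is None:
        # No audio and no explicit error message: still report a failure.
        payload["error"] = "synthesis produced no audio"
    return payload
```

The success branch would call build_payload(result.audio_data, start) and the Canceled branch build_payload(None, start, cancellation_details.error_details). The front end can then check the error field before calling atob.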
