I have a Python Flask application with the method shown below. In this method, I synthesize speech from text using Azure text-to-speech.

import base64
import io
import logging
import time

import azure.cognitiveservices.speech as speechsdk
from flask import Flask, jsonify, request
from pydub import AudioSegment

app = Flask(__name__)

@app.route("/retrieve_speech", methods=['POST'])
def retrieve_speech():
    text = request.form.get('text')
    start = time.time()
    # Defaults so the response can still be built if synthesis is canceled
    data = None
    speech_timing = None
    speech_key = "my key"
    speech_region = "my region"
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
    speech_config.endpoint_id = "my endpoint"
    speech_config.speech_synthesis_voice_name = "voice name"
    # audio_config=None keeps the synthesized audio in memory (result.audio_data)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)

    result = synthesizer.speak_text_async(text=text).get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        # Convert to wav
        audio = AudioSegment.from_file(io.BytesIO(result.audio_data))
        duration = audio.duration_seconds
        data = io.BytesIO()
        audio.export(data, format='wav')

        # Convert binary data to base64 string
        data = base64.b64encode(data.read()).decode('utf-8')

        speech_timing = time.time() - start
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            logging.error("Azure speech synthesis failed: {}".format(cancellation_details.error_details))

    return jsonify(audio_data=data, speech_timing=str(speech_timing), other="other strings")

I call the Flask method from JavaScript in my front end (a web page) as follows:

    $.post("/retrieve_speech", { text: "This is a test" }).done(function (data) {
        var audioData = data.audio_data;
        var speech_timing = data.speech_timing;
        var other = data.other;

        // Decode base64 string to binary
        var binaryData = atob(audioData);

        // Create an array of 8-bit unsigned integers
        var byteArray = new Uint8Array(binaryData.length);
        for (var i = 0; i < binaryData.length; i++) {
            byteArray[i] = binaryData.charCodeAt(i);
        }

        // Create a blob object from the byte array
        var blob = new Blob([byteArray], {type: 'audio/wav'});

        // Create a URL for the blob object
        var url = URL.createObjectURL(blob);

        // Play the audio
        var audio = new Audio(url);
        audio.play();
    });

The problem is that the audio does not play. In addition, the Flask application logs the following message:

Numba: Attempted to fork from a non-main thread, the TBB library may be in an invalid state in the child process.

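The speech synthesis itself works, so the problem must be in the conversion to WAV or to a string in the Flask application, and/or in decoding the string in JavaScript.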


  • Make sure that
    data = base64.b64encode(result.audio_data).decode('utf-8')
  • Use
    audio = AudioSegment.from_wav(io.BytesIO(result.audio_data))
    since you are exporting the data in WAV format.
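
Whether `from_wav` applies depends on what the service actually returned. The following is a minimal sketch, using only the standard library, of checking that a byte string is a RIFF/WAVE container and reading its duration. `describe_wav` and `make_silence` are hypothetical helpers introduced for illustration, and the sketch assumes the synthesizer output is uncompressed PCM WAV, which depends on the configured output format.

```python
import io
import wave


def describe_wav(audio_bytes: bytes) -> dict:
    """Return basic facts about a PCM WAV byte string, or raise ValueError."""
    # A RIFF/WAVE file starts with b"RIFF", 4 size bytes, then b"WAVE".
    if len(audio_bytes) < 12 or audio_bytes[:4] != b"RIFF" or audio_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE container")
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


def make_silence(seconds: float = 0.5, rate: int = 16000) -> bytes:
    """Build a short silent mono 16-bit WAV in memory (stand-in for result.audio_data)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silence()))
```

In the Flask view, calling such a helper on `result.audio_data` would give the duration without going through pydub. If it raises `ValueError`, the bytes are not WAV and the `audio/wav` MIME type used in the browser would not match.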

    result = synthesizer.speak_text_async(text=text).get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        # Convert the synthesized audio bytes to a base64 string
        data = base64.b64encode(result.audio_data).decode('utf-8')
        speech_timing = time.time() - start
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            logging.error("Azure speech synthesis failed: {}".format(cancellation_details.error_details))
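
To separate a server-side encoding problem from a client-side decoding problem, the base64 round trip can be checked in Python before the browser is involved. The sketch below is only an illustration under stated assumptions: `build_payload` and `decode_like_browser` are hypothetical helpers, and the sample bytes stand in for `result.audio_data`.

```python
import base64
import json


def build_payload(audio_bytes: bytes, speech_timing: float) -> str:
    """Serialize audio the way the Flask view does: base64 text inside JSON."""
    return json.dumps({
        "audio_data": base64.b64encode(audio_bytes).decode("utf-8"),
        "speech_timing": str(speech_timing),
    })


def decode_like_browser(payload: str) -> bytes:
    """Mimic the front end: parse the JSON, then base64-decode the audio field."""
    fields = json.loads(payload)
    # validate=True rejects characters outside the base64 alphabet
    return base64.b64decode(fields["audio_data"], validate=True)


if __name__ == "__main__":
    # Placeholder bytes with a RIFF-style prefix, not a playable file
    sample = b"RIFF\x24\x00\x00\x00WAVEfmt " + bytes(range(256))
    restored = decode_like_browser(build_payload(sample, 0.42))
    print(restored == sample, restored[:4])
```

If the restored bytes match the original and begin with `RIFF`, the payload is intact, and the remaining suspects are on the JavaScript side, such as a mismatched variable name or a missing call to `play()`.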

