Twilio Media Stream over a websocket is very staticky when saved to a file


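I have a Python program that starts a websocket from a Twilio phone call and saves the call's audio to a WAV file on the file system. It works! My program switches over to the websocket, and I am able to buffer the audio into a byte array and save that array to a WAV file. All good, but when I try to play the audio file it is very staticky and sounds very low quality. I am not sure whether this is the actual quality of the audio stream over the websocket, or whether I am doing something wrong when receiving or saving the audio. I have included my program here.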

# Program to accept a phone call via Twilio and save what the speaker says to a WAVE file
# Need to have ngrok, FastAPI and Twilio all setup properly

import os
import json
import base64
import wave
from fastapi import FastAPI, WebSocket, Request, WebSocketDisconnect
from fastapi.responses import HTMLResponse, Response
from jinja2 import Template

app = FastAPI()

# Global variables
wsserver = []

# Set the filename for writing the media stream
output_filename = "output.wav"

# Our buffer where we will queue up all the streamed audio
pcmu_data = bytearray()

# Default values, but will be overridden when we get our "start" message
channels = 1  # Mono
sample_width = 1  # 8-bit samples for PCMU
frame_rate = 8000  # Typical frame rate for PCMU

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    global wsserver
    await websocket.accept()
    wsserver.append(websocket)
    while True:
        message = await websocket.receive_text()
        await on_message(websocket, message)

async def on_message(websocket, message):
    global wsserver
    global frame_rate
    global channels
    global pcmu_data
    global sample_width

    try:
        msg = json.loads(message)
        event = msg.get("event")

        if event == "connected":
            print("A new call has connected.")

        elif event == "start":
            print(f"Starting Media Stream {msg.get('streamSid')}")
            print(msg)

            # Override our default values with what our start message tells us
            channels = msg['start']['mediaFormat']['channels']
            frame_rate = msg['start']['mediaFormat']['sampleRate']

        # The event that carries our audio stream
        elif event == "media":
            payload = msg['media']['payload']
        
            if payload:
                # Decode base64-encoded media data
                media_bytes = base64.b64decode(payload)

                # Add the data onto the end of our byte array
                pcmu_data.extend(media_bytes)

        elif event == "stop":
            print("Call Has Ended")

            # How long was our mulaw stream
            frames = len(pcmu_data)

            # Write the buffered bytes to a WAV file (one byte per frame for 8-bit mono)
            with wave.open(output_filename, 'w') as wav_file:
                wav_file.setnchannels(channels)
                wav_file.setsampwidth(sample_width)
                wav_file.setframerate(frame_rate)
                wav_file.setnframes(frames)
                wav_file.writeframes(pcmu_data)

    except WebSocketDisconnect as e:
        print(f"WebSocket disconnected: {e}")
        wsserver.remove(websocket)


@app.post("/")
async def post(request: Request):
    host = request.client.host
    print("Post call - host=" + host)
    xml = Template("""
    <Response>
        <Start>
            <Stream url="wss://83b4-73-70-107-57.ngrok-free.app/ws"/>
        </Start>
        <Say>Please state your message</Say>
        <Pause length="60" />
    </Response>
    """).render(host=host)
    return Response(content=xml, media_type="text/xml")

if __name__ == "__main__":
    print("Listening at Port 8080")
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

This works, but the audio stream saved to the file system is very staticky.

websocket twilio audio-streaming
1 Answer

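I switched from the wave module to the pywav library. Python's wave module only writes uncompressed PCM, so the mu-law bytes Twilio sends were being labeled as 8-bit PCM and played back as static. After changing the file-writing code to the following, the audio came out crystal clear: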

import pywav

data_bytes = bytes(pcmu_data)  # pcmu_data is a bytearray, so convert it directly (b"".join() would raise TypeError)
wave_write = pywav.WavWrite(output_filename, 1, 8000, 8, 7)  # 1 stands for mono channel, 8000 sample rate, 8 bit, 7 stands for MULAW encoding
wave_write.write(data_bytes)
wave_write.close()
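
If you would rather stay with the standard library, another option is to decode the mu-law bytes to 16-bit linear PCM first and then write them with wave. The following is only a minimal sketch: the helper names `ulaw_to_pcm16` and `save_ulaw_as_wav` are made up for illustration, and it assumes the stream really is 8 kHz mono mu-law, as the "start" message reports. The decoding follows the usual G.711 mu-law expansion.

```python
import struct
import wave


def build_ulaw_table():
    """Build the 256-entry G.711 mu-law -> signed 16-bit PCM lookup table."""
    table = []
    for code in range(256):
        u = ~code & 0xFF                 # mu-law bytes are transmitted bit-inverted
        t = ((u & 0x0F) << 3) + 0x84     # 4-bit mantissa plus the bias (0x84 = 132)
        t <<= (u & 0x70) >> 4            # scale by the 3-bit exponent
        table.append(0x84 - t if u & 0x80 else t - 0x84)  # apply the sign bit
    return table


ULAW_TABLE = build_ulaw_table()


def ulaw_to_pcm16(ulaw_bytes):
    """Decode mu-law bytes into little-endian signed 16-bit PCM bytes."""
    samples = [ULAW_TABLE[b] for b in ulaw_bytes]
    return struct.pack(f"<{len(samples)}h", *samples)


def save_ulaw_as_wav(ulaw_bytes, filename, frame_rate=8000, channels=1):
    """Hypothetical helper: decode mu-law audio and save it as a PCM WAV file."""
    pcm = ulaw_to_pcm16(ulaw_bytes)
    with wave.open(filename, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)         # 2 bytes per sample after decoding
        wav_file.setframerate(frame_rate)
        wav_file.writeframes(pcm)
```

With the question's code, the "stop" handler could then call `save_ulaw_as_wav(bytes(pcmu_data), output_filename, frame_rate, channels)`. The resulting file is twice as large as the mu-law version, but it is plain PCM.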