Is there a way to do this without first writing the file to disk?


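I have this code that transcribes the audio stream of a YouTube video given a link. As you can probably tell, it is slow: I first have to download the stream as .mp4, then convert it to .wav with moviepy, then record the audio, and only then transcribe it. I want exactly the same functionality, but without downloading the stream to a file first, by writing the stream's data to some buffer instead.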

from pytube import YouTube
import speech_recognition as sr
from moviepy.editor import AudioFileClip

video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)

recognizer = sr.Recognizer()

audio_stream.download(filename="mp4_output.mp4")
audio = AudioFileClip("mp4_output.mp4")
audio.write_audiofile("wav_output.wav")

with sr.AudioFile("./wav_output.wav") as audio_file:
    audio_data = recognizer.record(audio_file, duration=100)
    transcript = recognizer.recognize_sphinx(audio_data=audio_data)

I tried the following:

import io

from pytube import YouTube
import speech_recognition as sr

video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)

buffer = io.BytesIO()
audio_stream.stream_to_buffer(buffer)
recognizer = sr.Recognizer()

with sr.AudioFile(buffer) as audio_file:
    audio_data = recognizer.record(audio_file, duration=100)
    transcript = recognizer.recognize_sphinx(audio_data=audio_data) 

I get the following error:

audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
Is there a way to convert buffer to one of these formats?

python io speech-recognition moviepy pytube
1 Answer

You can use the pydub library, which helps with audio format conversion:

pip install pydub ffmpeg-python
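One thing to check before going further: pydub decodes compressed input such as MP4 by calling an external ffmpeg (or avconv) executable, and as far as I know the ffmpeg-python package is only a wrapper that does not ship that binary. Here is a minimal check, assuming the decoder is expected on your PATH. The helper name find_decoder is made up for illustration and is not part of pydub.

```python
import shutil


def find_decoder(candidates=("ffmpeg", "avconv")):
    """Return the path of the first decoder executable found on PATH, or None.

    Hypothetical helper for illustration; pydub performs its own lookup.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


decoder = find_decoder()
if decoder is None:
    print("No ffmpeg/avconv on PATH; pydub will likely fail to decode MP4 input.")
else:
    print(f"Decoder found: {decoder}")
```

If nothing is found, install ffmpeg through your operating system's package manager rather than through pip.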

Then you can modify your script to convert the audio stream in the buffer to WAV format before handing it to the speech recognition library:

from pytube import YouTube
import speech_recognition as sr
import io
from pydub import AudioSegment

# `link` is the YouTube URL from the question
video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)

buffer = io.BytesIO()
audio_stream.stream_to_buffer(buffer)

# Move back to the start of the BytesIO object
buffer.seek(0)

# Convert the audio stream to WAV format using pydub
audio_segment = AudioSegment.from_file(buffer, format="mp4")
wav_buffer = io.BytesIO()
audio_segment.export(wav_buffer, format="wav")

# Reset the buffer position to the start
wav_buffer.seek(0)

recognizer = sr.Recognizer()

with sr.AudioFile(wav_buffer) as audio_file:
    # duration=100 matches the question's code; omit it to transcribe the whole clip
    audio_data = recognizer.record(audio_file, duration=100)
    transcript = recognizer.recognize_sphinx(audio_data=audio_data)
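
As for why the original attempt failed: itag 140 is, to my knowledge, an audio-only MP4 (M4A/AAC) stream, while the error message says sr.AudioFile only reads WAV, AIFF and FLAC. If you want to confirm what a buffer actually holds, a rough sketch is to look at its first bytes. The function name sniff_container and the fake MP4 header below are my own illustration, not part of any of the libraries above, and the check only covers the most common signatures.

```python
import io
import wave


def sniff_container(buf):
    """Guess the container from the first 12 bytes, restoring the read position."""
    pos = buf.tell()
    buf.seek(0)
    head = buf.read(12)
    buf.seek(pos)  # leave the buffer exactly where the caller had it
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:4] == b"FORM" and head[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    if head[:4] == b"fLaC":
        return "flac"
    if head[4:8] == b"ftyp":  # MP4/M4A files normally start with an 'ftyp' box
        return "mp4"
    return "unknown"


# Build a tiny in-memory WAV (0.1 s of silence) with the standard library
wav_buffer = io.BytesIO()
with wave.open(wav_buffer, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)      # 16-bit samples
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 1600)

# Stand-in for the downloaded stream: just an MP4-style header, not real audio
mp4_buffer = io.BytesIO(b"\x00\x00\x00\x18ftypmp42" + b"\x00" * 16)

print(sniff_container(wav_buffer))  # wav
print(sniff_container(mp4_buffer))  # mp4
```

Running the same check on buffer and wav_buffer from the script above should show the difference between what pytube delivers and what pydub exports.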