I'm trying to add subtitles to a video using the method recommended in the documentation:
from moviepy.editor import *
from moviepy.video.tools.subtitles import SubtitlesClip
from moviepy.video.io.VideoFileClip import VideoFileClip
generator = lambda txt: TextClip(txt, font='Georgia-Regular', fontsize=24, color='white')
sub = SubtitlesClip("Output.srt", generator)
myvideo = VideoFileClip("video.mp4")
final = CompositeVideoClip([myvideo, sub])
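final.write_videofile("final.mp4", fps=myvideo.fps, threads=4)
This takes more than 2 hours to process, but when I remove the subtitles (as shown below) it runs in under a minute. Please let me know whether I'm missing something or whether this is normal. Thanks!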
from moviepy.editor import *
from moviepy.video.tools.subtitles import SubtitlesClip
from moviepy.video.io.VideoFileClip import VideoFileClip
myvideo = VideoFileClip("video.mp4")
final = CompositeVideoClip([myvideo])
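final.write_videofile("final.mp4", fps=myvideo.fps, threads=4)
The problem was that my "Output.srt" file had wrong timestamps, far beyond the length of the video.
For reference, anyone downloading subtitles with pytube should replace the "xml_caption_to_srt" method in pytube's source code (in the captions.py module) with the following: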
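One way to confirm this kind of problem is to compare the cue times in the .srt file with the video length. The following is only a minimal sketch using the standard library; the helper names (`to_seconds`, `cues_past_end`) and the sample subtitle text are made up for illustration and are not part of moviepy or pytube.

```python
import re

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000.0


def cues_past_end(srt_text, video_duration):
    """Return (start, end) pairs, in seconds, for cues that end after the video does."""
    late = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        start = to_seconds(*fields[:4])
        end = to_seconds(*fields[4:])
        if end > video_duration:
            late.append((start, end))
    return late


# Hypothetical subtitle text: the second cue starts almost three hours in.
sample = (
    "1\n00:00:01,000 --> 00:00:03,500\nHello\n\n"
    "2\n02:46:40,000 --> 02:46:42,000\nFar too late\n"
)
late_cues = cues_past_end(sample, video_duration=60.0)
print(late_cues)
```

In the setup from the question, `video_duration` would presumably be `myvideo.duration` and `srt_text` the contents of Output.srt; a non-empty result means some cues extend past the end of the video.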
def xml_caption_to_srt(self, xml_captions: str) -> str:
    """Convert xml caption tracks to "SubRip Subtitle (srt)".

    :param str xml_captions:
        XML formatted caption tracks.
    """
    # ElementTree and unescape are already imported at the top of captions.py.
    segments = []
    # The second child of the document root holds the <p> caption elements.
    root = ElementTree.fromstring(xml_captions)[1]
    i = 0
    for child in list(root):
        if child.tag == 'p':
            caption = ''
            # Skip <p> elements that have no child elements.
            if len(list(child)) == 0:
                continue
            for s in list(child):
                if s.tag == 's':
                    caption += ' ' + s.text
            caption = unescape(caption.replace("\n", " ").replace("  ", " "))
            # "t" (start) and "d" (duration) are given in milliseconds.
            try:
                duration = float(child.attrib["d"]) / 1000.0
            except KeyError:
                duration = 0.0
            start = float(child.attrib["t"]) / 1000.0
            end = start + duration
            sequence_number = i + 1  # convert from 0-indexed to 1.
            line = "{seq}\n{start} --> {end}\n{text}\n".format(
                seq=sequence_number,
                start=self.float_to_srt_time_format(start),
                end=self.float_to_srt_time_format(end),
                text=caption,
            )
            segments.append(line)
            i += 1
    return "\n".join(segments).strip()
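To make the timestamp arithmetic in the method above easier to follow, here is a small standalone sketch of the same idea: read the millisecond `t` and `d` attributes from each `<p>` element and turn them into SRT start and end times. The sample XML is invented to resemble the layout the method expects, and `srt_time` and `cue_timings` are hypothetical helpers, not pytube's own `float_to_srt_time_format`.

```python
from xml.etree import ElementTree


def srt_time(seconds):
    """Format a number of seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def cue_timings(xml_captions):
    """Yield (start, end) SRT timestamps for each <p> element in the body."""
    body = ElementTree.fromstring(xml_captions).find("body")
    for p in body.iter("p"):
        start = float(p.attrib["t"]) / 1000.0            # milliseconds to seconds
        duration = float(p.attrib.get("d", 0)) / 1000.0  # a missing "d" counts as 0
        yield srt_time(start), srt_time(start + duration)


# Made-up caption XML, assumed to mirror the structure handled above.
SAMPLE_XML = (
    '<timedtext format="3"><head/><body>'
    '<p t="1000" d="2500"><s>Hello</s></p>'
    '<p t="61500"><s>No duration</s></p>'
    '</body></timedtext>'
)
timings = list(cue_timings(SAMPLE_XML))
print(timings)
```

The first cue starts at 1 second and lasts 2.5 seconds, so it ends at 00:00:03,500; the second has no `d` attribute, so its start and end coincide.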
With the patched xml_caption_to_srt method in place, you can extract subtitles with correct timestamps, as shown here:
from pytube import YouTube

yt_transcript = YouTube('video_link')
caption = yt_transcript.captions['a.en']
en_caption_convert_to_srt = caption.generate_srt_captions()

# Save the captions to a file named Output.srt
with open("Output.srt", "w", encoding="utf-8") as text_file:
    text_file.write(en_caption_convert_to_srt)