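So I'm trying to build an application that reads some text aloud with a TTS engine. Because the text is too long to fit on the screen, I split it into shorter lines. I'm also trying to generate an .srt subtitle file for it, so that I can combine the video, the text-to-speech audio, and the subtitles into one final video. The problem is that I don't know how to calculate the end time of each subtitle line. Here is my current code, which doesn't work well.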
import pysrt


def process_text(text, max_line_length):
    lines = []
    line = ""
    for word in text.split():
        if len(line) + len(word) + 1 <= max_line_length:
            line += f" {word}"
        else:
            lines.append(line.strip())
            line = word
    if line:
        lines.append(line.strip())
    return lines


def get_durations(lines, rate):
    durations = []
    start_time = pysrt.SubRipTime()
    for line in lines:
        end_time = int(len(line) / (rate / 60.0) * 1000)
        end_time = start_time + pysrt.SubRipTime(milliseconds=end_time)
        durations.append([start_time, end_time])
        print(f'{start_time} {end_time}')
        start_time = end_time
    return durations


def generate_subtitle(text, max_line_length, rate, save_path):
    subtitle_file = pysrt.SubRipFile()
    lines = process_text(text, max_line_length)
    durations = get_durations(lines, rate)
    for i in range(len(lines)):
        subtitle_file.append(pysrt.SubRipItem(
            index=i + 1,
            start=durations[i][0],
            end=durations[i][1],
            text=lines[i],
        ))
    subtitle_file.save(f'{save_path}/subtitles.srt')
    return subtitle_file
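One possible reason the estimates run long: `get_durations` divides the number of characters in a line by `rate / 60`, so `rate` is effectively treated as characters per minute. Many TTS engines express their rate in words per minute instead (pyttsx3, for example, documents its `rate` property that way); whether that applies here is an assumption, since the engine is not named. A minimal sketch of a word-based estimate, using only the standard library and the hypothetical helpers `estimate_ms` and `estimate_spans`, might look like this:

```python
def estimate_ms(line: str, rate_wpm: float) -> int:
    """Estimate speaking time in milliseconds.

    Assumes rate_wpm is a speech rate in words per minute.
    """
    words = len(line.split())
    return round(words * 60_000 / rate_wpm)


def estimate_spans(lines, rate_wpm):
    """Return consecutive (start_ms, end_ms) pairs, one per line."""
    spans = []
    start = 0
    for line in lines:
        end = start + estimate_ms(line, rate_wpm)
        spans.append((start, end))
        start = end
    return spans
```

Each millisecond value can be passed to `pysrt.SubRipTime(milliseconds=...)` in the same way the code above already does. This remains a rough estimate: pauses at punctuation and unusually long words are not accounted for.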
The output file looks like this:
1
00:00:00,000 --> 00:00:08,625
What's happening.

2
00:00:08,625 --> 00:00:18,750
SHYAMALAMADINGDONG THE STAR

3
00:00:18,750 --> 00:00:29,250
WARS SEQUEL DOUBLE CONSPIRACY
But the first line should actually end at 00:00:02, not at 00:00:08.
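An alternative to estimating is to measure the audio itself. The sketch below assumes the TTS engine can write each subtitle line to its own PCM WAV file (that step is not shown here, and the file paths are hypothetical). It reads each file's length with the standard-library `wave` module and chains the lengths into start and end times. The helper names `wav_duration_ms` and `spans_from_wavs` are illustrative only.

```python
import wave


def wav_duration_ms(path: str) -> int:
    """Return the length of a PCM WAV file in milliseconds."""
    with wave.open(path, "rb") as wav:
        return round(wav.getnframes() * 1000 / wav.getframerate())


def spans_from_wavs(paths):
    """Return consecutive (start_ms, end_ms) pairs from per-line WAV files."""
    spans = []
    start = 0
    for path in paths:
        end = start + wav_duration_ms(path)
        spans.append((start, end))
        start = end
    return spans
```

If the final audio track is produced by concatenating the same per-line files, timings measured this way should line up with the speech more closely than any rate-based estimate.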