How do I format an OpenAI transcription as JSON with timestamps in Python?


import openai


openai.organization = "org-xxxxxx"
openai.api_key = "sk-xxxxx"

audio_file_path = "/Users/tejaksha/Downloads/dhoni.mp4"

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work

audio_file = open(audio_file_path, "rb")
transcript = openai.Audio.transcribe("whisper-1", audio_file)

With the code above I get the following output:

{
    "text": "Flat back, just got a little tight to him, he was wagging for it, set up for the slower ball and punished it. The one's going straight down the ground. And MS Daini just taking control."
}

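However, I want the result in the format below, with timestamps. How can I get that from an OpenAI transcription?

The format I actually need is: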

{
  "transcript": [
    {
      "text": "[Music]",
      "start": 7.39,
      "duration": 4.1
    },
    {
      "text": "once upon a time",
      "start": 16.48,
      "duration": 4.4
    },
    {
      "text": "in ancient china there lived three",
      "start": 17.6,
      "duration": 6.64
    },
    {
      "text": "old monks their names are not remembered",
      "start": 20.88,
      "duration": 6.559
    }
  ]
}

python-3.x timestamp openai-api transcription openai-whisper
1 Answer

I don't think the OpenAI API supports this kind of feature. However, you can use the Whisper library, which returns timestamps.

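import whisper

# Path taken from the question; replace it with your own file.
audio_file_path = "/Users/tejaksha/Downloads/dhoni.mp4"
model = whisper.load_model("base")
audio = whisper.load_audio(audio_file_path)  # decoding relies on ffmpeg
result = model.transcribe(audio)
# Each segment is a dict that includes "start", "end", and "text".
print(result["segments"])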

This does mean you need your own GPU or PC to run inference.
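
Whisper segments carry `start`, `end`, and `text` fields, while the question asks for `text`, `start`, and `duration`. Below is a minimal sketch of that conversion, assuming the input is shaped like Whisper's `result["segments"]`. The helper name `segments_to_transcript`, the rounding to three decimals, and the sample data are all hypothetical choices made for illustration.

```python
import json


def segments_to_transcript(segments):
    """Map Whisper-style segments to the {"transcript": [...]} layout from the question."""
    entries = []
    for seg in segments:
        entries.append({
            "text": seg["text"].strip(),  # Whisper text usually has a leading space
            "start": round(seg["start"], 3),
            # The question wants a duration, so derive it from start and end.
            "duration": round(seg["end"] - seg["start"], 3),
        })
    return {"transcript": entries}


# Hypothetical sample shaped like result["segments"]; not real model output.
sample_segments = [
    {"start": 0.0, "end": 4.5, "text": " once upon a time"},
    {"start": 4.5, "end": 9.0, "text": " in ancient china"},
]

print(json.dumps(segments_to_transcript(sample_segments), indent=2))
```

In a real run you would pass `result["segments"]` in place of `sample_segments`.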
