I'm using Whisper for transcription but get the following error and can't figure out what's wrong


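I'm using Whisper to transcribe an audio file. I have installed Python 3.9, ffmpeg and the related dependencies, as well as openai-whisper==20230308. I can import whisper, but when I try to run a transcription: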

import whisper

audio = "audio_dataset/testaudio.wav"
model = whisper.load_model("base")

# fp16=False runs inference in FP32; language="en" skips language detection
result = model.transcribe(audio, fp16=False, language="en")

...I get the following error:

ValueError                                Traceback (most recent call last)
Cell In [32], line 1
----> 1 result = model.transcribe(audio, fp16=False, language="en")

File ~/.local/lib/python3.9/site-packages/torch/autograd/grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
     24 @functools.wraps(func)
     25 def decorate_context(*args, **kwargs):
     26     with self.clone():
---> 27         return func(*args, **kwargs)

File ~/.local/lib/python3.9/site-packages/whisper/decoding.py:811, in decode(model, mel, options, **kwargs)
    808 if kwargs:
    809     options = replace(options, **kwargs)
--> 811 result = DecodingTask(model, options).run(mel)
    813 return result[0] if single else result

File ~/.local/lib/python3.9/site-packages/whisper/decoding.py:522, in DecodingTask.__init__(self, model, options)
    520 self.initial_tokens: Tuple[int] = self._get_initial_tokens()
    521 self.sample_begin: int = len(self.initial_tokens)
--> 522 self.sot_index: int = self.initial_tokens.index(tokenizer.sot)
    524 # inference: implements the forward pass through the decoder, including kv caching
    525 self.inference = PyTorchInference(model, len(self.initial_tokens))

ValueError: tuple.index(x): x not in tuple
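
The last frame fails in `self.initial_tokens.index(tokenizer.sot)`, which means the start-of-transcript token id was not among the initial tokens the decoder built. As a minimal sketch of that failure mode only (the token ids below are made up and are not Whisper's real values, and `find_sot_index` is a hypothetical helper), a tuple lookup for a missing value raises the same kind of `ValueError`:

```python
# Hypothetical token ids for illustration only; not Whisper's actual vocabulary.
initial_tokens = (101, 102, 103)
sot = 999  # stand-in for tokenizer.sot


def find_sot_index(tokens, sot_id):
    """Return the position of sot_id in tokens, or None if it is absent."""
    try:
        return tokens.index(sot_id)
    except ValueError:
        # tuple.index raises ValueError when the value is not present
        return None


# The id is missing from the tuple, so the lookup fails instead of returning a position.
print(find_sot_index(initial_tokens, sot))
```

This does not say why the ids disagree in a given installation; it only shows that the error is a lookup failure rather than a problem with the audio file.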

Has anyone run into a similar problem?

python-3.x ffmpeg speech-to-text valueerror openai-whisper
1 Answer

I ran into the same problem.
I was using the whisper-openai package.

This worked for me:

pip uninstall whisper-openai 

pip install git+https://github.com/openai/whisper.git 
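
Before and after reinstalling, it may help to check which Whisper distribution is actually installed. The snippet below is a small sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is my own, and it assumes the GitHub install registers itself under the distribution name `openai-whisper`:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Both names provide an importable `whisper` module, so check each of them.
for name in ("openai-whisper", "whisper-openai"):
    print(name, "->", installed_version(name))
```

If `whisper-openai` still shows a version after the uninstall, the old package is probably still the one being imported.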
