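I'm using Whisper to transcribe audio files. I have installed Python 3.9, ffmpeg and the related dependencies, as well as openai-whisper==20230308. I can import whisper, but when I try to run a transcription: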
import whisper
audio = "audio_dataset/testaudio.wav"
model = whisper.load_model("base")
result = model.transcribe(audio, fp16=False, language="en")
...I get the following error:
ValueError Traceback (most recent call last)
Cell In [32], line 1
----> 1 result = model.transcribe(audio, fp16=False, language="en")
File ~/.local/lib/python3.9/site-packages/torch/autograd/grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
24 @functools.wraps(func)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)
File ~/.local/lib/python3.9/site-packages/whisper/decoding.py:811, in decode(model, mel, options, **kwargs)
808 if kwargs:
809 options = replace(options, **kwargs)
--> 811 result = DecodingTask(model, options).run(mel)
813 return result[0] if single else result
File ~/.local/lib/python3.9/site-packages/whisper/decoding.py:522, in DecodingTask.__init__(self, model, options)
520 self.initial_tokens: Tuple[int] = self._get_initial_tokens()
521 self.sample_begin: int = len(self.initial_tokens)
--> 522 self.sot_index: int = self.initial_tokens.index(tokenizer.sot)
524 # inference: implements the forward pass through the decoder, including kv caching
525 self.inference = PyTorchInference(model, len(self.initial_tokens))
ValueError: tuple.index(x): x not in tuple
Has anyone run into a similar problem?
I ran into the same problem.
I was using the whisper-openai package.
This worked for me:
pip uninstall whisper-openai
pip install git+https://github.com/openai/whisper.git
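If you're not sure which of the two similarly named packages is actually installed in your environment, a quick check along these lines might help before and after running the commands above. This is only a sketch: the helper name `installed_versions` is my own, and I'm assuming the relevant distribution names are `openai-whisper` and `whisper-openai` as mentioned in this thread. It uses only the standard library (`importlib.metadata`, available on Python 3.8+).

```python
from importlib import metadata


def installed_versions(names):
    """Return a dict mapping each distribution name to its installed
    version string, or None if that distribution is not installed.

    `installed_versions` is a hypothetical helper, not part of whisper.
    """
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names taken from this thread (assumed, not verified here).
    candidates = ["openai-whisper", "whisper-openai"]
    for name, version in installed_versions(candidates).items():
        print(f"{name}: {version or 'not installed'}")
```

If `whisper-openai` shows a version, that would suggest the uninstall step above is the part that matters for you. Run it with the same interpreter your notebook uses, since the traceback shows packages under `~/.local/lib/python3.9`.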