I'm trying to load a bytes-like object named "audio" as a torchaudio object:
def convert_audio(audio, target_sr: int = 16000):
    wav, sr = torchaudio.load(audio)
    # (...) some other code
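I couldn't find any documentation online on how to load an audio object held as bytes in Torchaudio; it seems to accept only path strings. But I need to keep I/O to a minimum in my application, so I can't write a .wav file and load it back; I have to work with the audio object directly.
Does anyone have a suggestion for this case?
If I pass the audio directly, I get this error: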
Exception has occurred: AttributeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
'bytes' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
File "/home/felipe/.local/lib/python3.10/site-packages/torch/serialization.py", line 348, in _check_seekable
f.seek(f.tell())
With BytesIO:
Exception has occurred: UnpicklingError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
invalid load key, '\x00'.
File "/home/felipe/.local/lib/python3.10/site-packages/torch/serialization.py", line 1002, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
File "/home/felipe/.local/lib/python3.10/site-packages/torch/serialization.py", line 795, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/felipe/Coding projects/silero/stt.py", line 35, in convert_audio
wav,sr = torch.load(io.BytesIO(audio))
File "/home/felipe/Coding projects/silero/stt.py", line 60, in transcribe
input = prepare_model_input(convert_audio(audio),
File "/home/felipe/Coding projects/silero/psgui.py", line 97, in <module>
transcripton = stt.transcribe('en',audio)
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
If it is in WAV format, torchaudio.load should be able to decode it from a file-like object. Your code snippet looks fine to me.
The following tutorial demonstrates this with several different file-like objects.
https://pytorch.org/audio/0.13.0/tutorials/audio_io_tutorial.html#loading-from-file-like-object
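Whether the bytes are actually WAV can be checked before decoding. The sketch below is only an illustration: looks_like_wav and wrap_pcm16_as_wav are hypothetical helper names, and the wrapping step assumes the bytes are raw 16-bit little-endian mono PCM at 16 kHz, which the question does not confirm. The "invalid load key, '\x00'" message in the traceback hints that the data may not begin with a RIFF header, but that is a guess.

```python
import io
import wave


def looks_like_wav(audio: bytes) -> bool:
    """Return True if the bytes start with a RIFF/WAVE container header."""
    return len(audio) >= 12 and audio[:4] == b"RIFF" and audio[8:12] == b"WAVE"


def wrap_pcm16_as_wav(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw PCM samples in a WAV container, entirely in memory.

    Assumption: the samples are 16-bit little-endian PCM. Adjust the
    sample width, rate, and channel count to match the real source.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(pcm)
    return buffer.getvalue()
```

If looks_like_wav(audio) is false, wrap_pcm16_as_wav(audio) would produce bytes that a WAV decoder can read, provided the assumed sample layout is right.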
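That said, there are many reasons it might not work. For example, is the cursor of your file-like object pointing at the right position (the start of the audio data)? Does its read method conform to the io.RawIOBase.read protocol?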
It's hard to tell without seeing the error stack trace.
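The cursor question can be illustrated with the standard library alone. This is a minimal sketch with placeholder bytes, not real audio: a buffer that has just been written to has its cursor at the end, so a decoder reading from it sees no data until the cursor is rewound.

```python
import io

payload = b"RIFF-placeholder-bytes"  # stand-in data, not a valid WAV file

buffer = io.BytesIO()
buffer.write(payload)       # writing leaves the cursor at the end
at_end = buffer.read()      # nothing left to read from here

buffer.seek(0)              # rewind to the start of the data
from_start = buffer.read()  # now the full payload comes back

# A buffer constructed directly from bytes starts at position 0,
# so no seek is needed in that case.
fresh = io.BytesIO(payload)
```

So io.BytesIO(audio) built straight from the bytes is already positioned correctly; the rewind only matters when the same buffer was filled by earlier write calls or has already been partly read.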
You need to convert it to a file-like object first:
from io import BytesIO
import torchaudio

result = b'xxxxx'  # placeholder for the WAV-encoded bytes
wav_file_bytesIO = BytesIO(result)
data, sr = torchaudio.load(wav_file_bytesIO)
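Applied to the function from the question, this might look like the sketch below. Note that the BytesIO traceback in the question shows torch.load being called, which unpickles saved tensors and would explain the UnpicklingError; the call here is torchaudio.load. The resampling step and the return value are assumptions, since the question elides the rest of the function, and whether file-like objects are accepted can depend on the installed torchaudio version and backend.

```python
import io


def convert_audio(audio: bytes, target_sr: int = 16000):
    """Decode in-memory WAV bytes and resample them to target_sr."""
    # Imported lazily so this module can be imported without torchaudio.
    import torchaudio

    # torchaudio.load (not torch.load) decodes audio from a file-like object.
    wav, sr = torchaudio.load(io.BytesIO(audio))

    # Assumed follow-up step: resample only when the rates differ.
    if sr != target_sr:
        wav = torchaudio.functional.resample(wav, orig_freq=sr, new_freq=target_sr)
    return wav, target_sr
```

No temporary file is written at any point; the bytes stay in memory from capture to tensor.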