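I have been using the spleeter library in Python to separate vocals from audio files, and it works well on pre-recorded files. However, I am now trying to use spleeter together with PyAudio to remove vocals from audio in real time, and it does not work as expected. I wrote the following code, but it does not produce the desired output. I would appreciate expert help in solving this problem.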
from spleeter.separator import Separator
import multiprocessing
import pyaudio
import numpy as np
# Global variables
CHUNK_SIZE = 1024
SAMPLING_RATE = 16000
THIN_FACTOR = 0.5
vocals_data = bytes()
# Create PyAudio object
p = pyaudio.PyAudio()
# Define callback function for audio processing
def process_audio(in_data, frame_count, time_info, status):
    global vocals_data
    # Convert input data to numpy array
    audio_array = np.frombuffer(in_data, dtype=np.int16)
    # Perform vocal removal on the audio input
    # Pass the audio array as waveform to separate() method
    vocals = Separator('spleeter:2stems').separate(audio_array)
    # Convert vocals to audio data
    vocals_data = vocals['vocals'].flatten().astype(np.int16).tobytes()
    # Return processed data for output
    return vocals_data, pyaudio.paContinue
# Open stream for recording
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=SAMPLING_RATE,
                input=True,
                output=True,  # Set output to True for an output stream
                frames_per_buffer=CHUNK_SIZE,
                stream_callback=process_audio)
# Start stream
stream.start_stream()
# Create stream for playback
playback_stream = p.open(format=pyaudio.paInt16,
                         channels=1,
                         rate=SAMPLING_RATE,
                         output=True)
# Play processed data in real-time
while stream.is_active():
    if len(vocals_data) >= CHUNK_SIZE:
        playback_stream.write(vocals_data[:CHUNK_SIZE])
        vocals_data = vocals_data[CHUNK_SIZE:]
# Stop streams
stream.stop_stream()
stream.close()
playback_stream.stop_stream()
playback_stream.close()
# Terminate PyAudio object
p.terminate()
if __name__ == '__main__':
    multiprocessing.freeze_support()
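One thing that may be worth checking in the code above is the sample format. As far as I understand, `Separator.separate()` expects a floating-point waveform shaped `(n_samples, n_channels)`, and the stems it returns are floats roughly in the range [-1.0, 1.0]. `process_audio` instead passes a flat `int16` array, and casting the returned float stem straight to `np.int16` truncates almost every sample to 0. The helper names `pcm16_to_waveform` and `waveform_to_pcm16` below are hypothetical, not part of spleeter or PyAudio. This is only a sketch of the conversion, and it does not address the sample rate (the pretrained models are, I believe, meant for 44100 Hz rather than 16000 Hz).

```python
import numpy as np

INT16_SCALE = 32768.0  # magnitude of the most negative int16 sample


def pcm16_to_waveform(raw: bytes, channels: int = 1) -> np.ndarray:
    """Turn interleaved int16 PCM bytes into float32 of shape (n_samples, channels)."""
    samples = np.frombuffer(raw, dtype=np.int16)
    # Scale into [-1.0, 1.0) and give every channel its own column.
    return (samples.astype(np.float32) / INT16_SCALE).reshape(-1, channels)


def waveform_to_pcm16(waveform: np.ndarray) -> bytes:
    """Turn a float waveform in [-1.0, 1.0] back into interleaved int16 PCM bytes."""
    clipped = np.clip(waveform, -1.0, 1.0)
    # Scale up before casting; a bare astype(np.int16) would truncate to 0.
    return np.round(clipped * 32767.0).astype(np.int16).tobytes()
```

A related detail: `CHUNK_SIZE` counts frames, while `len(vocals_data)` counts bytes. With `paInt16` and one channel each frame is 2 bytes, so slicing `vocals_data[:CHUNK_SIZE]` writes only half a chunk per iteration.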
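I tried to implement real-time vocal removal with spleeter and PyAudio. I expected the code to separate the vocals from the audio input in real time and play back the processed audio without problems.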
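A second thing that may matter is where the heavy work runs. A PyAudio callback has to return quickly, and building a `Separator` and running a model inside it for every 1024-frame chunk is unlikely to keep up. A common pattern is to have the callback only enqueue the raw bytes, and let a worker thread gather them into larger blocks and process those. The sketch below shows that pattern with the standard library only. `start_block_worker`, `block_bytes`, and `process_block` are hypothetical names, and `process_block` stands in for a function that would call a single, pre-built separator.

```python
import queue
import threading
from typing import Callable, Optional


def start_block_worker(
    in_q: "queue.Queue[Optional[bytes]]",
    out_q: "queue.Queue[bytes]",
    block_bytes: int,
    process_block: Callable[[bytes], bytes],
) -> threading.Thread:
    """Collect small chunks from in_q into fixed-size blocks and process them
    on a background thread, away from the audio callback."""

    def run() -> None:
        pending = bytearray()
        while True:
            chunk = in_q.get()
            if chunk is None:  # sentinel: stop the worker
                break
            pending.extend(chunk)
            # Emit every complete block that has accumulated so far.
            while len(pending) >= block_bytes:
                block = bytes(pending[:block_bytes])
                del pending[:block_bytes]
                out_q.put(process_block(block))

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

With this in place, an input-only callback could reduce to `in_q.put(in_data)` followed by `return None, pyaudio.paContinue`, and the playback loop could write whatever arrives on `out_q`. Output then lags the input by at least one block, so this is near-real-time rather than truly real-time, and whether spleeter can process blocks fast enough on a given machine is something that would need to be measured.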
I also tried calling multiprocessing.freeze_support(), but it did not solve the problem. Any help or suggestions would be greatly appreciated!