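I am currently trying to create a modem-like script in Python that uses sound, via sounddevice, to respond to other instances of itself, somewhat like the real modems used in the past.
I have already developed some sending and replying functions, such as a DTMF generator and a binary converter, but I am having trouble detecting certain frequencies (440 Hz + 350 Hz, a.k.a. the dial tone). This prevents me from moving on to listening for other sounds (DTMF, data, etc.) and replying in real time.
I am also very new to sounddevice and numpy, having only used numpy code that other users provided for opencv. So far I have only figured out how to create and play a chosen sine wave for a chosen length of time. For the receiving part I mostly used ChatGPT, but its code either gave no reply or consistently returned errors, so I decided to try writing one myself. The documentation, however, makes no sense (at least to me).
If you can help me in any way with the script ChatGPT gave me, here it is: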
import sounddevice as sd
import numpy as np

# Parameters
target_frequencies = [440, 350]  # Frequencies to detect (440Hz and 350Hz)
duration = 15  # Duration in seconds
sample_rate = 44100  # Sample rate

# Callback function for audio input
def audio_callback(indata, frames, time, status):
    # Convert audio data to mono
    mono_data = np.mean(indata, axis=1)
    # Compute the Fast Fourier Transform (FFT)
    fft_data = np.fft.fft(mono_data)
    freqs = np.fft.fftfreq(len(fft_data), 1 / sample_rate)
    # Find the indices of the target frequencies
    target_indices = [np.argmin(np.abs(freqs - freq)) for freq in target_frequencies]
    # Check if the target frequencies are present
    if all(abs(fft_data[index]) > 10000 for index in target_indices):
        print("yo yo yo")

# Start recording
with sd.InputStream(callback=audio_callback, channels=2, samplerate=sample_rate):
    print("Listening for tones...")
    sd.sleep(int(duration * 1000))  # Record for the desired duration
    print("Recording finished")
Also, please at least explain to me how InputStream works and how I can detect sounds from it.
Thanks!
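I hope that, from what ChatGPT produced, you have learned not to trust it for programming applications.
For your application, it is not enough to detect the maximum spectral component, and it is certainly not enough to detect any such component above the arbitrary value of 10,000. Instead, you need some kind of heuristic that compares the in-band spectral energy you care about with the total energy; if that ratio exceeds a threshold, your tone is deemed present. (You also need to check the total energy to distinguish "some sound" from "background noise" in your environment; I have not shown this.)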
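The total-energy check mentioned in the parenthesis is not part of the answer's own code. A minimal sketch of how it might be combined with the in-band ratio follows. The function name `classify_chunk`, the RMS-based loudness gate, and both threshold values are assumptions for illustration, not tuned values; real microphone input would need its own calibration.

```python
import numpy as np

def classify_chunk(chunk, rate, targets=(350, 440),
                   ratio_threshold=0.5, rms_threshold=0.01):
    """Label one mono chunk of float samples as 'silence', 'tone' or 'other'.

    Both thresholds are illustrative guesses, not calibrated values.
    """
    chunk = np.asarray(chunk, dtype=float)

    # Loudness gate: treat very quiet chunks as background noise
    rms = np.sqrt(np.mean(chunk ** 2))
    if rms < rms_threshold:
        return 'silence'

    # Amplitude spectrum; bin k is centred on k * rate / len(chunk) Hz
    ampl = np.abs(np.fft.rfft(chunk))
    bin_width = rate / len(chunk)
    idx = [int(round(f / bin_width)) for f in targets]

    total = ampl[1:].sum()  # skip the DC bin
    if total == 0:
        return 'other'

    # Share of the spectrum that sits in the target bins
    ratio = ampl[idx].sum() / total
    return 'tone' if ratio > ratio_threshold else 'other'
```

With a chunk of 4410 samples at 44100 Hz, a synthetic 350 Hz + 440 Hz mix should come back as `'tone'`, an all-zero chunk as `'silence'`, and a lone 1000 Hz sine as `'other'`.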
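The FFT involves many trade-offs based on sample size and sampling frequency. You don't want the resolution too low, or you won't be able to tell good frequencies from bad ones. You don't want it too high, or each chunk will take longer to capture than it needs to (and take up more memory than it needs to). You don't want the sample size too small, or you'll miss the lowest frequencies. You don't want the sample size too large, or you'll spend too long capturing samples and won't be able to respond promptly.
In this case, a reasonable frequency bucket size is 10 Hz, because the greatest common factor of the two frequencies you are interested in is 10, and that is enough to distinguish these tones from the others in the DTMF/POTS system.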
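To make the trade-off concrete, the sketch below works out the chunk length and capture time for a few bucket sizes, and checks whether 350 Hz and 440 Hz fall exactly on bucket centres. The helper `chunk_plan` and the 44100 Hz sample rate are assumptions chosen for illustration.

```python
def chunk_plan(rate, bucket_hz, freqs=(350, 440)):
    """Return (samples per chunk, seconds per chunk, all freqs on a bucket centre)."""
    n = rate // bucket_hz      # samples needed for buckets of this width
    seconds = n / rate         # time it takes to capture one chunk
    on_centre = all(f % bucket_hz == 0 for f in freqs)
    return n, seconds, on_centre

for bucket_hz in (1, 10, 50):
    n, seconds, on_centre = chunk_plan(44100, bucket_hz)
    print(f'{bucket_hz:>2} Hz buckets: {n:>5} samples, '
          f'{seconds:.3f} s per chunk, tones on bucket centres: {on_centre}')
```

At 44100 Hz, 1 Hz buckets need a full second of audio per chunk, 10 Hz buckets need 0.1 s, and 50 Hz buckets need only 0.02 s but no longer place 440 Hz on a bucket centre, so its energy would spread across neighbouring buckets.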
Before trying this on a microphone, try it on the canned file from Wikipedia first:
import librosa
import numpy as np

print('Loading canned tone...')
canned, rate = librosa.load('US_dial_tone.mp3', mono=True)

# Ignore next-highest DTMF tone of 697 Hz and up
# From the Precise Tone Plan (https://en.wikipedia.org/wiki/Precise_tone_plan),
# ignore 480 and 620 Hz
freq_bucket_size = 10  # greatest common factor of 350 and 440
n = rate//freq_bucket_size
target_frequencies = 350, 440
target_idx = [f//freq_bucket_size for f in target_frequencies]

print('Processing...')
# Trim the tail so the signal divides evenly into chunks of n samples
canned = canned[:len(canned) - len(canned)%n]
for chunk in canned.reshape((-1, n)):
    ampl = np.abs(np.fft.rfft(chunk))
    total_energy = ampl[1:].sum()  # skip the DC bin
    tone_energy = ampl[target_idx].sum()
    match = tone_energy / total_energy
    if match > 0.5:
        print(f'Tone matched at {match:.1%} energy')
Loading canned tone...
Processing...
Tone matched at 90.4% energy
Tone matched at 91.3% energy
Tone matched at 89.7% energy
Tone matched at 90.5% energy
Tone matched at 91.1% energy
Tone matched at 91.0% energy
Tone matched at 89.7% energy
Tone matched at 90.8% energy
Tone matched at 92.1% energy
Tone matched at 90.8% energy
Tone matched at 89.7% energy
Tone matched at 90.1% energy
Tone matched at 92.5% energy
Tone matched at 91.7% energy
Tone matched at 89.8% energy
Tone matched at 91.3% energy
Tone matched at 92.1% energy
Tone matched at 90.8% energy
Tone matched at 91.2% energy
Tone matched at 91.3% energy
Tone matched at 92.5% energy
Tone matched at 91.8% energy
Tone matched at 90.3% energy
Tone matched at 93.1% energy
Tone matched at 93.2% energy
Tone matched at 91.3% energy
Tone matched at 91.8% energy
Tone matched at 93.2% energy
Tone matched at 92.6% energy
Tone matched at 92.9% energy
Tone matched at 93.3% energy
Tone matched at 93.8% energy
Tone matched at 94.1% energy
Tone matched at 92.2% energy
Tone matched at 92.9% energy
Tone matched at 93.3% energy
Tone matched at 93.8% energy
Tone matched at 93.6% energy
Tone matched at 93.0% energy
Tone matched at 94.0% energy
Tone matched at 92.1% energy
Tone matched at 91.9% energy
Tone matched at 93.4% energy
Tone matched at 93.2% energy
Tone matched at 91.3% energy
Tone matched at 92.9% energy
Tone matched at 92.8% energy
Tone matched at 91.1% energy
Tone matched at 91.4% energy
Tone matched at 90.1% energy
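On the InputStream part of the question: `sd.InputStream` opens the input device and, while the `with` block is active, calls your callback from a separate audio thread with `indata`, a NumPy array of shape `(frames, channels)`. Passing `blocksize` asks for a fixed number of frames per call, so each callback can be treated as one FFT chunk. A minimal sketch of the same ratio test applied to the microphone follows; the names `tone_match` and `listen`, the 44100 Hz rate, and the 0.5 threshold are assumptions, and the threshold will likely need tuning for real room noise.

```python
import numpy as np

RATE = 44100                      # assumed sample rate in Hz
BUCKET_HZ = 10                    # frequency bucket size used in the answer
N = RATE // BUCKET_HZ             # frames per callback, i.e. one FFT chunk
TARGET_IDX = [f // BUCKET_HZ for f in (350, 440)]

def tone_match(chunk):
    """Fraction of the (non-DC) spectrum that falls in the target buckets."""
    ampl = np.abs(np.fft.rfft(chunk))
    total = ampl[1:].sum()
    return ampl[TARGET_IDX].sum() / total if total > 0 else 0.0

def listen(seconds=15, threshold=0.5):
    """Listen on the default input device and report dial-tone matches."""
    # Imported here so tone_match can be used without audio hardware
    import sounddevice as sd

    def callback(indata, frames, time, status):
        if status:
            print(status)             # report overflows and similar flags
        mono = indata.mean(axis=1)    # (frames, channels) -> (frames,)
        match = tone_match(mono)
        if match > threshold:
            print(f'Tone matched at {match:.1%} energy')

    with sd.InputStream(samplerate=RATE, blocksize=N,
                        channels=1, callback=callback):
        sd.sleep(int(seconds * 1000))  # keep the stream open while listening
```

Calling `listen()` should print a match line for each 0.1 s block in which the dial tone dominates. Keeping the detection in `tone_match` means it can be checked against synthetic or canned audio before involving the microphone.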