Integrating Voice Isolation with Speech Recognition in Swift


I am trying to integrate Voice Isolation with speech recognition in Swift. My goal is to get better quality out of iOS's built-in speech recognition, since we all know how inaccurate it can be, especially with a noisy background. Right now my speech recognition works fine, and I have created an audio unit that enables voice processing for voice isolation, but I don't know how to go from there and integrate the audio unit with the speech recognition task. I have tried piecing things together from what I found online, but I am still a beginner with Swift and really don't know where to go from here. Also, any other suggestions for improving the speech recognition would be great. Here is the code I have so far:

For speech recognition, this is what I have, and it works fine:

import AVFoundation
import Speech

let audioEngineSpeech = AVAudioEngine()
let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
var recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask: SFSpeechRecognitionTask?

let inputNode = audioEngineSpeech.inputNode
inputNode.reset()
inputNode.removeTap(onBus: 0)
inputNode.isVoiceProcessingBypassed = true   // note: voice processing is bypassed here

// Feed the microphone buffers into the recognition request.
let format = inputNode.inputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    recognitionRequest.append(buffer)
}

audioEngineSpeech.prepare()
try audioEngineSpeech.start()   // start() throws

recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
    guard let result = result else { return }
    let transcription = result.bestTranscription.formattedString
    // using transcription here...
}
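
One thing I came across while piecing this together: apparently AVAudioEngine's input node can enable Apple's voice processing directly with setVoiceProcessingEnabled (iOS 13+), as long as it is called before the engine starts. I am not sure whether that gives the same result as the Voice Isolation microphone mode, but this is a minimal sketch of what I mean (variable names are just for the sketch):

// Sketch only: enable voice processing on the engine's own input node
// instead of a separate audio unit (not sure this is equivalent to Voice Isolation).
let processedInput = audioEngineSpeech.inputNode
do {
    try processedInput.setVoiceProcessingEnabled(true)   // must be called before the engine starts
} catch {
    print("Could not enable voice processing:", error)
}
processedInput.isVoiceProcessingBypassed = false          // keep the processing active

let processedFormat = processedInput.inputFormat(forBus: 0)
processedInput.installTap(onBus: 0, bufferSize: 1024, format: processedFormat) { buffer, _ in
    recognitionRequest.append(buffer)                     // same tap as above, now on processed audio
}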

I have also created the audio unit:

import AudioToolbox

// Description for Apple's voice-processing I/O unit.
var desc = AudioComponentDescription(
    componentType: kAudioUnitType_Output,
    componentSubType: kAudioUnitSubType_VoiceProcessingIO,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0,
    componentFlagsMask: 0
)

guard let component = AudioComponentFindNext(nil, &desc) else {
    fatalError("Unable to find audio component")
}
var audioUnit: AudioUnit?
let osErr = AudioComponentInstanceNew(component, &audioUnit)
print("AudioComponentInstanceNew status:", osErr)

var enable: UInt32 = 1

// Enable automatic gain control.
AudioUnitSetProperty(audioUnit!,
    kAUVoiceIOProperty_VoiceProcessingEnableAGC,
    kAudioUnitScope_Global,
    0,
    &enable,
    UInt32(MemoryLayout<UInt32>.size))

// Enable input on the I/O unit (element 0 here; the input element is usually bus 1).
AudioUnitSetProperty(audioUnit!,
    kAudioOutputUnitProperty_EnableIO,
    kAudioUnitScope_Input,
    0,
    &enable,
    UInt32(MemoryLayout<UInt32>.size))

// Trying to enable sound isolation; I am not sure this is a valid property ID
// (kAudioUnitSubType_AUSoundIsolation is a component subtype).
AudioUnitSetProperty(audioUnit!,
    kAudioUnitSubType_AUSoundIsolation,
    kAudioUnitScope_Input,
    0,
    &enable,
    UInt32(MemoryLayout<UInt32>.size))

// Note: writing 1 here bypasses voice processing rather than enabling it.
AudioUnitSetProperty(audioUnit!,
    kAUVoiceIOProperty_BypassVoiceProcessing,
    kAudioUnitScope_Global,
    0,
    &enable,
    UInt32(MemoryLayout<UInt32>.size))

// Let the unit allocate its render buffers.
AudioUnitSetProperty(audioUnit!,
    kAudioUnitProperty_ShouldAllocateBuffer,
    kAudioUnitScope_Output,
    0,
    &enable,
    UInt32(MemoryLayout<UInt32>.size))

let result = AudioUnitInitialize(audioUnit!)
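
This is as far as I got when trying to connect the unit to the recognition request. I think I need an input callback on the voice-processing unit, but the sketch below is only a guess pieced together from examples (the bus number included), and I don't know how to get the rendered audio into recognitionRequest from inside the callback:

// Guesswork: register an input callback on the voice-processing unit (bus 1).
var callbackStruct = AURenderCallbackStruct(
    inputProc: { (inRefCon, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, ioData) -> OSStatus in
        // Presumably AudioUnitRender should be called here to pull the processed
        // microphone audio, wrap it in an AVAudioPCMBuffer and append it to
        // recognitionRequest; this is the part I am missing.
        return noErr
    },
    inputProcRefCon: nil
)
AudioUnitSetProperty(audioUnit!,
    kAudioOutputUnitProperty_SetInputCallback,
    kAudioUnitScope_Global,
    1,
    &callbackStruct,
    UInt32(MemoryLayout<AURenderCallbackStruct>.size))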

In my view I also show the system user interface that prompts the user to select the Voice Isolation microphone mode:

AVCaptureDevice.showSystemUserInterface(.microphoneModes)
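
As far as I can tell an app can only observe which microphone mode the user picked, not set it in code, so I just check the selection afterwards (iOS 15+; this logging is my own addition and may be unnecessary):

// Check which microphone mode the user selected in Control Center.
switch AVCaptureDevice.preferredMicrophoneMode {
case .voiceIsolation:
    print("User selected Voice Isolation")
case .wideSpectrum:
    print("User selected Wide Spectrum")
case .standard:
    print("User selected Standard")
@unknown default:
    break
}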

Any help would be greatly appreciated.

Thanks in advance.

swift avfoundation speech-recognition core-audio speech-to-text