Google Cloud Speech: "invalid encoding" error

When I generate a WAV file and pass it to the Google Cloud Speech service as base64, I get the following error:

Error: 13 INTERNAL: Request message serialization failure: invalid encoding

I use the recordrtc library to generate the WAV file in my Angular project. The code is below:

private record() {
  this.recorder = new RecordRTC.StereoAudioRecorder(this.stream, {
    mimeType: 'audio/wav',
    audioBitsPerSecond : 44100,
  });
  this.recorder.record();
}

stopRecording() {
  if (this.recorder) {
    this.recorder.stop((blob: any) => {
      if (this.startTime) {
        const wavName = encodeURIComponent('audio_' + new Date().getTime() + '.wav');
        this.stopMedia();
        this._recorded.next({ blob: blob, title: wavName });
      }
    }, () => {
      this.stopMedia();
      this._recordingFailed.next("Recording failed.");
    });
  }
}

this.audioRecordingService.getRecordedBlob().subscribe((data) => {
  this.audioBlob = data.blob;
  this.audioName = data.title;
  var reader = new FileReader();
  reader.readAsDataURL(data.blob);
  reader.onloadend = () => {
    this.audioBase64 = reader.result as string;
  }
  this.ref.detectChanges();
});

I then pass the audioBase64 variable as a parameter to my Firebase function, which calls recognize as follows:

export const transcriptAudio = onCall(async (data) => {
  try {
    let userId = data.auth?.token?.uid;
    if(!userId)
      throw new HttpsError('unauthenticated', 'The user is not authenticated.');

    let audioBase64 = data.data.audioBase64 as string;

    const client = new SpeechClient();
    const response = await client.recognize(
      {
        audio: {content: audioBase64 },
        config: {
          encoding: "LINEAR16",
          sampleRateHertz: 44100,
          languageCode: "en-US"}});

    const transcription = response[0].results?.map(result => result.alternatives![0].transcript).join('\n');
    return { transcription };
  } catch (error) {
    console.log(`Transcription error: ${error}`);
  }
});

Somehow the base64 data seems to be corrupted, but I cannot see the root cause. Any help would be appreciated.

node.js angular google-cloud-functions google-speech-api
1 Answer

The problem was in the base64 string itself. Its content looked like this:

data:audio/wav;base64,UklGRiTABgBXQVZFZm10IBAAAAABAAIAgLsAAADuAgAEABAAZGF0YQDABgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

So I split it on ',' and kept the second part:

this.audioBase64 = this.audioBase64.split(',')[1];

英国GRiTABgBXQVZFZm10IBAAAAABAAIAgLsAAADuAgAEABAAZGF0YQDABgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

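Now it works as expected. Apparently the Google Speech service expects the bare base64 payload, without the "data:audio/wav;base64," prefix that FileReader.readAsDataURL puts in front of it.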
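For anyone who wants something a little more defensive than split(',')[1], here is a minimal sketch. The helper name stripDataUrlPrefix is hypothetical (it is not part of the original code or of any library), and it assumes the input is either a data URL containing ";base64," or already bare base64.

```typescript
// Hypothetical helper, not part of the original post: returns only the
// base64 payload of a data URL such as "data:audio/wav;base64,UklGR...".
function stripDataUrlPrefix(dataUrl: string): string {
  const marker = ';base64,';
  const index = dataUrl.indexOf(marker);
  // Assumption: a string without the marker is already bare base64,
  // so it is returned unchanged instead of becoming undefined.
  return index === -1 ? dataUrl : dataUrl.slice(index + marker.length);
}
```

It could be called as this.audioBase64 = stripDataUrlPrefix(reader.result as string); inside the onloadend handler. Unlike split(',')[1], it does not return undefined when the prefix is already absent.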

I now get the following error instead, but the original problem is solved:

Must use single channel (mono) audio, but WAV header indicates 2 channels.
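The WAV header in the base64 string above can be decoded to see what the recorder actually produced. The sketch below is only an illustration: readWavHeader, decodeBase64Prefix and buildRecognitionConfig are hypothetical names, and it assumes a canonical PCM WAV file whose "fmt " chunk immediately follows the RIFF/WAVE tags. Applied to the string above it reports 2 channels, 48000 Hz and 16 bits per sample, so the 44100 in the request config does not match the file either. audioChannelCount is a field of the Speech RecognitionConfig; whether passing the header's channel count is enough for a given setup is an assumption to verify. Recording in mono may be the simpler route (I believe RecordRTC's StereoAudioRecorder accepts a numberOfAudioChannels option for this).

```typescript
// All names here are hypothetical; this is a sketch, not a library API.
interface WavInfo {
  channels: number;
  sampleRateHertz: number;
  bitsPerSample: number;
}

const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decodes the first byteCount bytes of a base64 string without Buffer or
// atob, so it behaves the same in the browser and in a Cloud Function.
function decodeBase64Prefix(base64: string, byteCount: number): Uint8Array {
  const out = new Uint8Array(byteCount);
  let written = 0;
  for (let i = 0; written < byteCount; i += 4) {
    let chunk = 0;
    for (let j = 0; j < 4; j++) {
      const ch = base64.charAt(i + j);
      const value = ch === '' ? -1 : BASE64_ALPHABET.indexOf(ch);
      if (value < 0) {
        throw new Error('Input is too short or is not bare base64.');
      }
      chunk = (chunk << 6) | value;
    }
    for (let shift = 16; shift >= 0 && written < byteCount; shift -= 8) {
      out[written++] = (chunk >> shift) & 0xff;
    }
  }
  return out;
}

// Reads the channel count, sample rate and bit depth from the first 36 bytes.
// Assumption: canonical layout, "fmt " chunk directly after "RIFF....WAVE".
function readWavHeader(base64: string): WavInfo {
  const view = new DataView(decodeBase64Prefix(base64, 36).buffer);
  const tag = (offset: number): string =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3),
    );
  if (tag(0) !== 'RIFF' || tag(8) !== 'WAVE' || tag(12) !== 'fmt ') {
    throw new Error('Not a canonical WAV header.');
  }
  return {
    channels: view.getUint16(22, true), // little-endian fields
    sampleRateHertz: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
  };
}

// Builds a recognize() config that agrees with the file instead of
// hard-coding 44100 Hz and implicitly assuming mono.
function buildRecognitionConfig(base64: string) {
  const wav = readWavHeader(base64);
  return {
    encoding: 'LINEAR16' as const,
    sampleRateHertz: wav.sampleRateHertz,
    audioChannelCount: wav.channels,
    languageCode: 'en-US',
  };
}
```

In the Firebase function this would replace the inline config, for example config: buildRecognitionConfig(audioBase64). Because the decoder rejects the ':' character, it also throws early if the "data:audio/wav;base64," prefix was left in place.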
