How do I set up streaming recognition with Google Cloud Speech-to-Text V2 in Node.js?


I'm trying to set up streamingRecognize() with Google Cloud Speech-to-Text V2 in Node.js to stream audio data, but when setting up the stream it always throws the same error on the initial recognizer request:

Error: 3 INVALID_ARGUMENT: Invalid resource field value in the request.
    at callErrorFromStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/call.ts:81:17)
    at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:701:51)
    at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client-interceptors.ts:416:48)
    at /Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/resolving-call.ts:111:24
    at processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
    at ServiceClientImpl.makeBidiStreamRequest (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:685:42)
    at ServiceClientImpl.<anonymous> (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/make-client.ts:189:15)
    at /Users/<filtered>/backend/node_modules/@google-cloud/speech/build/src/v2/speech_client.js:318:29
    at /Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:71:19
    at /Users/<filtered>/backend/node_modules/google-gax/src/normalCalls/timeout.ts:54:13
    at StreamProxy.setStream (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streaming.ts:204:20)
    at StreamingApiCaller.call (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:88:12)
    at /Users/<filtered>/backend/node_modules/google-gax/src/createApiCall.ts:118:26
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

{
  code: 3,
  details: 'Invalid resource field value in the request.',
  metadata: Metadata {
    internalRepr: Map(2) {
      'google.rpc.errorinfo-bin' => [Array],
      'grpc-status-details-bin' => [Array]
    },
    options: {}
  },
  statusDetails: [
    ErrorInfo {
      metadata: [Object],
      reason: 'RESOURCE_PROJECT_INVALID',
      domain: 'googleapis.com'
    }
  ],
  reason: 'RESOURCE_PROJECT_INVALID',
  domain: 'googleapis.com',
  errorInfoMetadata: {
    service: 'speech.googleapis.com',
    method: 'google.cloud.speech.v2.Speech.StreamingRecognize'
  }
}

Setting up the stream takes two steps: 1. send a recognizer request object that tells Google which recognizer to use for the audio data that follows (it contains the path of the recognizer as a string plus an optional config object to override some of the recognizer's options), and 2. send the same kind of request, without the config, but with a buffer of the audio bytes to transcribe.
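
For illustration, here is a minimal sketch of that intended two-step write sequence. The camelCase field names (streamingConfig, autoDecodingConfig, languageCodes) and the "telephony" model are my assumptions based on the v2 proto; this is not verified working code, just how I understand the flow:

import { v2 } from "@google-cloud/speech";

// Sketch only: field names assumed from the v2 StreamingRecognizeRequest /
// RecognitionConfig protos; "telephony" mirrors the model I use elsewhere.
const client = new v2.SpeechClient();

async function streamingSketch(projectId: string, audioChunk: Buffer) {
  const stream = client.streamingRecognize();
  stream.on("data", (response) => console.log(JSON.stringify(response)));
  stream.on("error", (error) => console.error(error));

  // Step 1: recognizer path plus streaming config, no audio yet.
  stream.write({
    recognizer: `projects/${projectId}/locations/global/recognizers/_`,
    streamingConfig: {
      config: {
        autoDecodingConfig: {},
        languageCodes: ["en-US"],
        model: "telephony",
      },
    },
  });

  // Step 2: every following request carries only raw audio bytes.
  stream.write({ audio: audioChunk });
  stream.end();
}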

I'm not sending any audio data yet, because the initial recognizer request already fails.

It would be great if someone could help me with this; it seems fairly simple and is probably obvious if you know where the problem comes from.

My guesses as to where I went wrong:

  1. I misconfigured something in Google Cloud, but that seems unlikely, because everything except the streaming request works.
  2. I built the request object incorrectly. If that's the case, please also show what the request object for sending the audio buffer should look like.

I've read the Google Cloud Speech-to-Text V2 documentation and tried to implement everything as described; in the end it should return the transcribed audio. What I've already done:

  1. Set up a recognizer in the Google Cloud console.
  2. Checked that all required APIs are enabled.
  3. Checked that the service account etc. has the correct permissions.
  4. Checked that authentication works (see the sketch right after this list for how I double-check the resolved project).
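
This is roughly how I double-check that my credentials resolve to the project I expect (a quick sketch; getProjectId() is the helper on the generated client, and the mismatch warning is just my own sanity check):

import { v2 } from "@google-cloud/speech";

// Compare the project the credentials resolve to with the project ID
// used to build the recognizer resource name.
async function checkProject(expectedProjectId: string): Promise<void> {
  const client = new v2.SpeechClient();
  const resolvedProjectId = await client.getProjectId();
  console.log(`credentials resolve to: ${resolvedProjectId}`);
  console.log(`recognizer path uses:   ${expectedProjectId}`);
  if (resolvedProjectId !== expectedProjectId) {
    console.warn("Project mismatch - the recognizer path points at a different project.");
  }
}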

I've also tried implementing streamingRecognize() several times, as shown below, with some minor variations:

public async initialize() {
    // streamingConfig (not shown here) has the same shape as in the sketch above.
    const recognizerName = `projects/${this.projectId}/locations/global/recognizers/_`;
    const transcriptionRequest = {
      recognizer: recognizerName,
      streaming_config: streamingConfig,
    };

    const stream = this.client
      .streamingRecognize()
      .on("data", function (response) {
        console.log(response);
      })
      .on("error", function (error) {
        console.log(error);
      });

    // Write the initial recognizer/config request; the audio requests would follow here.
    stream.write(transcriptionRequest);
  }

I've also tried several recognizer IDs in recognizerName instead of "_". I've tried several different kinds of transcriptionRequest where I omitted streaming_config or renamed it to "config". I've triple-checked my projectId, and I've also swapped in the project number (found on the Google Cloud console home page) instead of the project ID. Nothing worked; I always get the same error.
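
For completeness, the recognizer resource names I've been trying look like this (sketch with placeholder IDs; as far as I can tell, recognizerPath() is the generated path helper on the v2 client and should produce the same string I build by hand):

import { v2 } from "@google-cloud/speech";

// Placeholder IDs, not my real ones. "_" is the implicit default recognizer.
const client = new v2.SpeechClient();
const explicitRecognizer = client.recognizerPath("my-project-id", "global", "rclatest");
// -> "projects/my-project-id/locations/global/recognizers/rclatest"
const defaultRecognizer = "projects/my-project-id/locations/global/recognizers/_";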

Apart from that, I've also tried making a normal createRecognizer and a recognize request with v2, and both work fine:

 // Creates a Recognizer: WORKS
  public async createRecognizer() {
    const recognizerRequest = {
      parent: `projects/${this.projectId}/locations/global`,
      recognizerId: "rclatest",
      recognizer: {
        languageCodes: ["en-US"],
        model: "telephony",
      },
    };

    const operation = await this.client.createRecognizer(recognizerRequest);
    const recognizer = operation[0].result;
    const recognizerName = recognizer; //.name;
    console.log(`Created new recognizer: ${recognizerName}`);
  }

  // Transcribes Audio: WORKS
  public async transcribeFile() {
    const recognizerName = `projects/${this.projectId}/locations/global/recognizers/${this.recognizerId}`;
    const content = fs.readFileSync(this.audioFilePath).toString("base64");
    const transcriptionRequest = {
      recognizer: recognizerName,
      config: {
        // Automatically detects audio encoding
        autoDecodingConfig: {},
      },
      content: content,
    };

    const response = await this.client.recognize(transcriptionRequest);
    for (const result of response[0].results) {
      console.log(`Transcript: ${result.alternatives[0].transcript}`);
    }
  }
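
For context, the class members used above (this.client, this.projectId, this.recognizerId, this.audioFilePath) are set up roughly like this; the concrete values are placeholders, not my real ones:

import * as fs from "fs"; // used by transcribeFile() above
import { v2 } from "@google-cloud/speech";

// Sketch of the surrounding class with placeholder configuration values.
class TranscriptionService {
  private client: v2.SpeechClient;
  private projectId = "my-project-id";
  private recognizerId = "rclatest";
  private audioFilePath = "./audio.wav";

  constructor() {
    // Credentials come from GOOGLE_APPLICATION_CREDENTIALS / application default credentials.
    this.client = new v2.SpeechClient();
  }

  // initialize(), createRecognizer() and transcribeFile() as shown above
}
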
Tags: node.js, streaming, google-speech-api, google-cloud-speech, google-speech-to-text-api
1 Answer

Did you ever find a solution for the streamingRecognize variant? I'm facing exactly the same problem right now.
