如何防止SqsListener发送垃圾邮件错误日志

问题描述 投票:0回答:1

我有一个使用

io.awspring.cloud.sqs
SQS 监听器的 Spring Boot 应用程序

@SqsListener(value = "${spring.cloud.aws.sqs.queue-name}",)
public String processSesEvent(EmailSqsMessage message) {
  [code]
}

当应用程序部署到 Kubernetes 时,会出现大量相同错误日志的垃圾邮件(10 秒内约 10k)。日志:

Error polling for messages in queue [queue name]

java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Connection refused: sqs.us-east-1.amazonaws.com/3.239.232.173:443
    at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:65)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:51)
    at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:79)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeAttemptExecute(AsyncRetryableStage.java:103)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:184)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:79)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$null$0(MakeAsyncHttpRequestStage.java:103)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.completeResponseFuture(MakeAsyncHttpRequestStage.java:240)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:163)
    at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Connection refused: sqs.us-east-1.amazonaws.com/3.239.232.173:443
    at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
    at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:223)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:218)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:182)
    ... 23 common frames omitted
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: sqs.us-east-1.amazonaws.com/3.239.232.173:443
Caused by: java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.Net.pollConnect(Native Method)
    at java.base/sun.nio.ch.Net.pollConnectNow(Unknown Source)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:335)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.lang.Thread.run(Unknown Source)

在错误爆发之后,它们会停止,并且 SQS 侦听器按预期工作。

如何防止此类垃圾邮件的发生?

我尝试实现一个健康指标并将其包含在 Kubernetes 中:

@Component
@AllArgsConstructor
public class SqsHealthIndicator implements HealthIndicator {
  private final SqsAsyncClient sqsAsyncClient;
  @Value("${spring.cloud.aws.sqs.ses-event-queue}")
  private String queueName;

  @Override
  public Health health() {
    try {
      sqsAsyncClient.getQueueUrl(GetQueueUrlRequest.builder().queueName(queueName).build()).get();
      return Health.up().build();
    } catch (Exception e) {
      return Health.down(e).build();
    }
  }
}

application.yml

management:
  endpoint:
    health:
      show-details: always
      group:
        liveness:
          include:
            - livenessState
            - sqs

deployment.yaml

startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  failureThreshold: 10
  periodSeconds: 5
  initialDelaySeconds: 20

这确实在活跃端点中包含了运行状况指示器,但它并没有解决问题。

添加了一行来记录健康检查:

var result = sqsAsyncClient.getQueueUrl(GetQueueUrlRequest.builder().queueName(queueName).build()).get();
log.info("sqsAsync call result: {}", result);

查看日志,似乎对该端点的调用已成功,并且在错误日志的垃圾邮件发生之前,活性检查已通过。

该服务使用 istio 代理作为 sidecar,因此有一种理论认为 spring boot 应用程序在 istio pod 准备好之前就已启动,这会导致连接问题,所以我也尝试按照建议添加注释here,但是没有效果。

编辑:经过更多调查后,垃圾日志似乎发生在正在关闭的 Pod 上,而不是在部署时创建的 Pod 上。

spring-boot kubernetes spring-boot-actuator sqslistener
1个回答
0
投票

问题在于

istio-proxy
容器在 Spring Boot 应用程序之前关闭,导致外部服务失去视线。通过向
preStop
容器添加
istio-proxy
钩子解决了该问题。

lifecycle:
  preStop:
    exec:
      command: [ "/bin/sh", "-c", "sleep 30" ]
© www.soinside.com 2019 - 2024. All rights reserved.