Connection timeout when reading a file from S3 with AWS SDK 2.x


I have a batch application that reads large files from Amazon S3. My S3 configuration:

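import java.time.Duration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

@Configuration
public class S3Configuration {
    @Bean
    public S3Client s3Client() {
        return S3Client.builder()
                .credentialsProvider(DefaultCredentialsProvider.create())
                .region(Region.AP_EAST_1)
                .overrideConfiguration(ClientOverrideConfiguration.builder()
                        .apiCallAttemptTimeout(Duration.ofHours(6))
                        .build())
                .build();
    }
}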

And the code that reads the file:

GetObjectRequest getObjectRequest = GetObjectRequest.builder()
        .bucket(bucketName)
        .key(key)
        .build();

ResponseInputStream<GetObjectResponse> getObjectResponseResponseInputStream = s3Client.getObject(getObjectRequest);

But after about half an hour I get a connection timeout error. Stack trace attached:

Caused by: java.net.SocketException: Connection reset
at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:323) ~[na:na]
at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350) ~[na:na]
at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803) ~[na:na]
at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966) ~[na:na]
at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:484) ~[na:na]
at java.base/sun.security.ssl.SSLSocketInputRecord.readFully(SSLSocketInputRecord.java:467) ~[na:na]
at java.base/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:243) ~[na:na]
at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:181) ~[na:na]
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[na:na]
at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1509) ~[na:na]
at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1480) ~[na:na]
at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1065) ~[na:na]
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[httpcore-4.4.16.jar:4.4.16]
at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:197) ~[httpcore-4.4.16.jar:4.4.16]
at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176) ~[httpcore-4.4.16.jar:4.4.16]
at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) ~[httpclient-4.5.13.jar:4.5.13]
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[na:na]
at software.amazon.awssdk.services.s3.checksums.ChecksumValidatingInputStream.read(ChecksumValidatingInputStream.java:112) ~[s3-2.20.144.jar:na]
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[na:na]
at software.amazon.awssdk.core.io.SdkFilterInputStream.read(SdkFilterInputStream.java:66) ~[sdk-core-2.20.144.jar:na]
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:270) ~[na:na]
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:313) ~[na:na]
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:188) ~[na:na]
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:177) ~[na:na]
at java.base/java.io.BufferedReader.fill(BufferedReader.java:162) ~[na:na]
at java.base/java.io.BufferedReader.readLine(BufferedReader.java:329) ~[na:na]
at java.base/java.io.BufferedReader.readLine(BufferedReader.java:396) ~[na:na]
at org.springframework.batch.item.file.FlatFileItemReader.readLine(FlatFileItemReader.java:216) ~[spring-batch-infrastructure-5.0.3.jar:5.0.3]

I have tried apiCallAttemptTimeout, apiCallTimeout, retryPolicy, and so on, but nothing has worked for me. Can someone help me solve this?

amazon-web-services amazon-s3 spring-batch aws-sdk aws-sdk-java-2.0
1 Answer

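The timeout is probably caused by the delay the ItemWriter introduces when it updates the database after each batch of lines is read from the file. While the writer is busy, the S3 stream sits idle and the connection can be reset. To work around this, you can either:

a) implement an ItemReader that reads the S3 file in chunks using byte-range fetches (see https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html); or

b) download the file to a local temporary file in one step (via a tasklet), and then read that local temporary file in a chunk-oriented step that updates the database.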
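For option (a), here is a rough sketch of how the chunked reading could look. It is not tied to the SDK: `RangedInputStream` and `RangeFetcher` are hypothetical names I made up for illustration, and the fetcher callback is where you would issue the actual ranged GET. The idea is that each range is fetched with its own short request, so no connection is held open while the writer is committing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;

/**
 * Sketch: an InputStream that pulls an object in fixed-size byte ranges.
 * RangedInputStream and RangeFetcher are illustrative names, not SDK classes.
 */
class RangedInputStream extends InputStream {

    /** Hypothetical callback returning bytes [start, endInclusive] of the object. */
    @FunctionalInterface
    interface RangeFetcher {
        byte[] fetch(long start, long endInclusive) throws IOException;
    }

    private final RangeFetcher fetcher;
    private final long objectSize;
    private final int chunkSize;
    private long nextOffset = 0;          // offset of the next byte to request
    private byte[] buffer = new byte[0];  // bytes of the current chunk
    private int bufferPos = 0;            // read position inside the current chunk

    RangedInputStream(RangeFetcher fetcher, long objectSize, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.fetcher = Objects.requireNonNull(fetcher);
        this.objectSize = objectSize;
        this.chunkSize = chunkSize;
    }

    /** Loads the next chunk; returns false once the whole object has been read. */
    private boolean fill() throws IOException {
        if (nextOffset >= objectSize) {
            return false;
        }
        long end = Math.min(nextOffset + chunkSize, objectSize) - 1;
        buffer = fetcher.fetch(nextOffset, end);
        if (buffer.length == 0) {
            throw new IOException("Empty response for range " + nextOffset + "-" + end);
        }
        bufferPos = 0;
        nextOffset += buffer.length;
        return true;
    }

    @Override
    public int read() throws IOException {
        if (bufferPos >= buffer.length && !fill()) {
            return -1;
        }
        return buffer[bufferPos++] & 0xFF;
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
        Objects.checkFromIndexSize(off, len, dst.length);
        if (len == 0) {
            return 0;
        }
        if (bufferPos >= buffer.length && !fill()) {
            return -1;
        }
        // Serve at most what is left in the current chunk; callers loop for more.
        int n = Math.min(len, buffer.length - bufferPos);
        System.arraycopy(buffer, bufferPos, dst, off, n);
        bufferPos += n;
        return n;
    }
}
```

With the SDK from the question, the fetcher could be something like `(start, end) -> s3Client.getObjectAsBytes(GetObjectRequest.builder().bucket(bucketName).key(key).range("bytes=" + start + "-" + end).build()).asByteArray()`, with the object size taken from `s3Client.headObject(...).contentLength()`. You could then hand the stream to the FlatFileItemReader, for example wrapped in Spring's `InputStreamResource`. The chunk size is a trade-off between memory use and the number of requests, and you would probably want retry handling inside the fetcher.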
