Spark with AWS throws java.lang.NoSuchMethodError

Problem description

We run a Spark 2.4 standalone cluster. We need to upgrade the AWS SDK to 1.12.654 (we are not yet ready to move to AWS SDK v2), so we upgraded to Hadoop 3.1.1. After working through many dependency issues, we are now stuck on a java.lang.NoSuchMethodError that we cannot figure out.

The exception is:

Caused by: java.lang.NoSuchMethodError: com.amazonaws.http.HttpResponse.getHttpRequest()Lcom/amazonaws/thirdparty/apache/http/client/methods/HttpRequestBase;
    at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3ObjectResponseHandler.java:57)
    at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3ObjectResponseHandler.java:29)
    at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:69)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1794)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleSuccessResponse(AmazonHttpClient.java:1477)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5520)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5467)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1554)
    at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:183)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
    at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:182)
    at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$lazySeek$1(S3AInputStream.java:328)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$2(Invoker.java:190)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:188)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:210)
    at org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:321)
    at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:433)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:218)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:176)
    at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:124)
    at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:65)
    at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:47)
    at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.readFile(CSVDataSource.scala:199)
    at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:142)
    at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:136)
    at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
    at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)

When we inspected the jar versions, we found that S3ObjectResponseHandler comes from aws-java-sdk-s3-1.12.654.jar and calls HttpResponse from aws-java-sdk-core-1.12.654.jar.

When I look at the imports in the HttpResponse class, it has

import org.apache.http.client.methods.HttpRequestBase;

so the class appears to come from the HttpClient dependency, but the exception is thrown against

com/amazonaws/thirdparty/apache/http/client/methods/HttpRequestBase

which I found in aws-java-sdk-bundle-1.12.654.jar. I am not sure how to resolve this. I have excluded httpclient from the dependencies to try to force the use of aws-java-sdk-bundle. I have been stuck on this for a week and don't know what I am missing.
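One way to confirm which copy of each class actually wins on the classpath is to ask the JVM directly. A minimal sketch (run the same lookup in the Spark driver with the class names from the stack trace; java.lang.String is used below only so the snippet runs standalone):

```java
// Sketch: report where the JVM loaded a class from, to detect
// duplicate or shaded copies of the same class on the classpath.
public class WhichJar {
    public static String locationOf(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
        // JDK bootstrap classes report a null code source; anything loaded
        // from a jar reports that jar's URL.
        return src == null ? "(bootstrap/JDK)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // On the cluster, query the classes from the stack trace instead:
        //   com.amazonaws.http.HttpResponse
        //   com.amazonaws.services.s3.internal.S3ObjectResponseHandler
        //   org.apache.http.client.methods.HttpRequestBase
        System.out.println(locationOf("java.lang.String")); // prints "(bootstrap/JDK)"
    }
}
```

If HttpResponse resolves to aws-java-sdk-core-1.12.654.jar while S3ObjectResponseHandler resolves to the bundle jar, the two shading flavours are mixed, which is what the NoSuchMethodError descriptor suggests.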

Any pointers would be appreciated.

Update:

I see the following in the aws bundle pom file. I am guessing this is why the package name shows as com.amazonaws.thirdparty.apache.http when the exception is thrown. But I am still not sure why the method cannot be found when the aws bundle is on the classpath.

<relocations>
    <relocation>
        <pattern>org.apache.http</pattern>
        <shadedPattern>com.amazonaws.thirdparty.apache.http</shadedPattern>
    </relocation>
</relocations>

Thanks.

amazon-web-services apache-spark httpclient aws-sdk
1 Answer

hadoop-aws depends on the aws-java-sdk-bundle jar precisely to avoid inconsistencies between the individual aws artifacts. Do the same: depend only on the bundle and keep the individual aws-java-sdk-* jars (core, s3, ...) off the classpath. Your stack trace shows that mix: the shaded S3ObjectResponseHandler expects the relocated com.amazonaws.thirdparty.apache.http return type, but the HttpResponse it calls was loaded from the unshaded aws-java-sdk-core jar, whose getHttpRequest() returns the plain org.apache.http type, hence the NoSuchMethodError.
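A minimal Maven sketch of that setup (versions are assumptions taken from the question; match the bundle version to the one your hadoop-aws release was built against, and exclude any transitive aws-java-sdk-* artifacts that other dependencies drag in):

```xml
<!-- Only the shaded bundle: never mix it with the unshaded
     aws-java-sdk-core / aws-java-sdk-s3 jars on the same classpath. -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>3.1.1</version>
</dependency>
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-bundle</artifactId>
    <version>1.12.654</version>
</dependency>
```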

See: https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws

Upgrading to Hadoop 3.3.6 is also recommended, for all hadoop-* JARs at the same version.
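When cleaning up an existing install, a quick sanity check is to list which AWS SDK jars are present under Spark's jars directory (the path in the example comment is an assumption; use your own install location):

```shell
# List aws-java-sdk-* jars in a directory. Seeing both the bundle jar
# and the individual per-service jars means the classpath is mixed.
list_aws_sdk_jars() {
  ls "$1" 2>/dev/null | grep -E '^aws-java-sdk' | sort
}
# e.g. list_aws_sdk_jars /opt/spark/jars
```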

© www.soinside.com 2019 - 2024. All rights reserved.