当运行spark流媒体应用程序从kinesis获取数据时,出现以下错误。
Exception in thread "Kinesis Receiver 0" java.lang.NoSuchFieldError: NO_INTS
at com.fasterxml.jackson.dataformat.cbor.CBORParser.<init>(CBORParser.java:285)
at com.fasterxml.jackson.dataformat.cbor.CBORParserBootstrapper.constructParser(CBORParserBootstrapper.java:91)
at com.fasterxml.jackson.dataformat.cbor.CBORFactory._createParser(CBORFactory.java:392)
at com.fasterxml.jackson.dataformat.cbor.CBORFactory.createParser(CBORFactory.java:308)
at com.fasterxml.jackson.dataformat.cbor.CBORFactory.createParser(CBORFactory.java:295)
at com.fasterxml.jackson.dataformat.cbor.CBORFactory.createParser(CBORFactory.java:26)
at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2294)
at com.amazonaws.protocol.json.JsonContent.parseJsonContent(JsonContent.java:72)
at com.amazonaws.protocol.json.JsonContent.<init>(JsonContent.java:64)
at com.amazonaws.protocol.json.JsonContent.createJsonContent(JsonContent.java:54)
at com.amazonaws.http.JsonErrorResponseHandler.handle(JsonErrorResponseHandler.java:89)
at com.amazonaws.http.JsonErrorResponseHandler.handle(JsonErrorResponseHandler.java:40)
at com.amazonaws.http.AwsErrorResponseHandler.handleAse(AwsErrorResponseHandler.java:53)
at com.amazonaws.http.AwsErrorResponseHandler.handle(AwsErrorResponseHandler.java:41)
at com.amazonaws.http.AwsErrorResponseHandler.handle(AwsErrorResponseHandler.java:26)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1781)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2809)
at com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2776)
at com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2765)
at com.amazonaws.services.kinesis.AmazonKinesisClient.executeListShards(AmazonKinesisClient.java:1557)
at com.amazonaws.services.kinesis.AmazonKinesisClient.listShards(AmazonKinesisClient.java:1528)
at com.amazonaws.services.kinesis.clientlibrary.proxies.KinesisProxy.listShards(KinesisProxy.java:325)
at com.amazonaws.services.kinesis.clientlibrary.proxies.KinesisProxy.getShardList(KinesisProxy.java:440)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisShardSyncer.getShardList(KinesisShardSyncer.java:349)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisShardSyncer.syncShardLeases(KinesisShardSyncer.java:159)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisShardSyncer.checkAndCreateLeasesForNewShards(KinesisShardSyncer.java:112)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardSyncTask.call(ShardSyncTask.java:84)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:49)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker.initialize(Worker.java:683)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker.run(Worker.java:614)
at org.apache.spark.streaming.kinesis.KinesisReceiver$$anon$1.run(KinesisReceiver.scala:191)
下面是执行的命令。
spark-submit --jars spark-streaming-kinesis-asl_2.11-2.4.5.jar,amazon-kinesis-client-1.13.2.jar,aws-java-sdk-kinesis-1.11.745.jar,aws-java-sdk-core-1.11.745.jar,aws-java-sdk-sts-1.11.745.jar,aws-java-sdk-1.11.745.jar,aws-java-sdk-dynamodb-1.11.745.jar,aws-java-sdk-cloudwatch-1.11.745.jar,jackson-core-2.9.8.jar,jackson-dataformat-cbor-2.9.8.jar,jackson-databind-2.9.8.jar snowplow_spark/src/main.py
和代码是非常基本的。
kinesisStream = KinesisUtils.createStream(
ssc, kinesisAppName=appName, streamName=streamName, endpointUrl=endpointUrl,
regionName=regionName, initialPositionInStream=InitialPositionInStream.LATEST,
checkpointInterval=10)
我已经卡在这几天,不知道该怎么做。我知道某个地方的 jackson 版本在 spark 和 aws-sdk 中不匹配,但不知道该把哪个放在 --jars 中。
不知道你是否已经解决了这个问题,但是...。https:/issues.apache.orgjirabrowseSPARK-25455。 问题与此有关。我也遇到了同样的问题。我在pom.xml中使用下面的依赖关系让它工作了。另外,在我的项目中,所有的aws库都是2.16.0版本。希望能帮到你。
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-cbor</artifactId>
<version>2.6.7</version>
</dependency>