从EMR写入DSE图

问题描述 投票:0回答:2

我们正在尝试编写来从EMR写入DSE图(cassandra)并继续获取这些错误。我的JAR是一个带有byos依赖关系的阴影jar。任何帮助,将不胜感激。

java.lang.UnsatisfiedLinkError: org.apache.cassandra.utils.NativeLibraryLinux.getpid()J
    at org.apache.cassandra.utils.NativeLibraryLinux.getpid(Native Method)
    at org.apache.cassandra.utils.NativeLibraryLinux.callGetpid(NativeLibraryLinux.java:124)
    at org.apache.cassandra.utils.NativeLibrary.getProcessID(NativeLibrary.java:429)
    at org.apache.cassandra.utils.UUIDGen.hash(UUIDGen.java:386)
    at org.apache.cassandra.utils.UUIDGen.makeNode(UUIDGen.java:367)
    at org.apache.cassandra.utils.UUIDGen.makeClockSeqAndNode(UUIDGen.java:300)
    at org.apache.cassandra.utils.UUIDGen.<clinit>(UUIDGen.java:41)
    at com.datastax.bdp.graph.spark.sql.vertex.SimpleVertexIdAssigner$.simpleEdgeId(SimpleVertexIdAssigner.scala:19)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphFrame$$anonfun$3.apply(DseGraphFrame.scala:417)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphFrame$$anonfun$3.apply(DseGraphFrame.scala:416)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:619)
    at org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:131)
    at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:220)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:298)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1165)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

19/04/26 12:55:49 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 18, ip-10-69-16-79.vpc.internal, executor 1): java.lang.NoClassDefFoundError: Could not initialize class org.apache.cassandra.utils.UUIDGen
    at com.datastax.bdp.graph.spark.sql.vertex.SimpleVertexIdAssigner$.simpleEdgeId(SimpleVertexIdAssigner.scala:19)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphFrame$$anonfun$3.apply(DseGraphFrame.scala:417)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphFrame$$anonfun$3.apply(DseGraphFrame.scala:416)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
apache-spark amazon-emr datastax-enterprise-graph
2个回答
0
投票

通常在使用noexec属性挂载临时目录时会发生此类错误,该属性会阻止加载java驱动程序使用的本机库。通常的解决方法是使用-Djava.io.tmpdir=...标志将Java指向另一个位置以获取临时文件 - 此位置不应使用noexec标志安装。

附:不幸的是,我对EMR知之甚少


0
投票

原来是JNA问题。添加了JNA依赖项作为着色jar的一部分,它工作正常。

<dependency>
        <groupId>net.java.dev.jna</groupId>
        <artifactId>jna</artifactId>
        <version>4.2.2</version>
</dependency>
© www.soinside.com 2019 - 2024. All rights reserved.