Reading data from Kinesis in PySpark


I am trying to read data from Kinesis in PySpark using KinesisUtils.createStream, but I get the following error.


  Spark Streaming's Kinesis libraries not found in class path. Try one of the following.

  1. Include the Kinesis library and its dependencies with in the
     spark-submit command as

     $ bin/spark-submit --packages org.apache.spark:spark-streaming-kinesis-asl:2.4.4 ...

  2. Download the JAR of the artifact from Maven Central http://search.maven.org/,
     Group Id = org.apache.spark, Artifact Id = spark-streaming-kinesis-asl-assembly, Version = 2.4.4.
     Then, include the jar in the spark-submit command as

     $ bin/spark-submit --jars <spark-streaming-kinesis-asl-assembly.jar> ...

________________________________________________________________________________________________


Traceback (most recent call last):
  File "/Users/ahmad.muhammad/Desktop/kinesis-reader.py", line 8, in <module>
    kinesisStream = KinesisUtils.createStream(ssc,'Ahmad-Kineses','twitter-stream','https://kinesis.us-east-1.amazonaws.com/','us-east-1',InitialPositionInStream.TRIM_HORIZON,20)
  File "/Users/Ahmad.Muhammad/opt/apache-spark/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/streaming/kinesis.py", line 84, in createStream
TypeError: 'JavaPackage' object is not callable
python apache-spark pyspark spark-streaming amazon-kinesis
1 Answer

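Assuming you are running PySpark on your local machine, you can set an environment variable so that the Kinesis package is loaded when PySpark starts the JVM. Try this in your terminal: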

export PYSPARK_SUBMIT_ARGS="--master local[2] --packages org.apache.spark:spark-streaming-kinesis-asl_2.11:2.4.4 pyspark-shell"
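
If you would rather set the variable from inside the script than in the terminal, a minimal sketch might look like the following. The helper name `kinesis_submit_args` is hypothetical, and the Scala suffix `2.11` is an assumption based on the prebuilt Spark 2.4.4 distribution shown in the traceback; adjust both versions to match your installation. The variable has to be set before the first SparkContext is created, and it should only matter when the script is started with plain `python`, not through `spark-submit`.

```python
import os


def kinesis_submit_args(spark_version, scala_version="2.11", master="local[2]"):
    """Build a PYSPARK_SUBMIT_ARGS value that pulls in the Kinesis connector.

    Hypothetical helper for illustration; the versions are assumptions and
    should match the Spark build that is actually installed.
    """
    package = "org.apache.spark:spark-streaming-kinesis-asl_{}:{}".format(
        scala_version, spark_version
    )
    # PySpark expects the value to end with the "pyspark-shell" token.
    return "--master {} --packages {} pyspark-shell".format(master, package)


# Set this before creating the SparkContext / StreamingContext, because
# PySpark reads the variable when it launches the JVM.
os.environ["PYSPARK_SUBMIT_ARGS"] = kinesis_submit_args("2.4.4")
```

After this runs, the usual `SparkContext`, `StreamingContext` and `KinesisUtils.createStream` calls from the question can follow in the same script.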

Hope this solves your problem.
