How do I add an sbt dependency to integrate pyspark and flume?


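I have tried this many times on my own, but I keep running into the same problem. Can someone help me add the sbt dependency for pyspark and flume integration? Below is the command I ran and its output.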

spark-submit --packages 'org.apache.spark:spark-streaming-flume-assembly_2.12:2.4.5' spark_flume.py 
Ivy Default Cache set to: /home/hduser/.ivy2/cache
The jars for the packages stored in: /home/hduser/.ivy2/jars
:: loading settings :: url = jar:file:/usr/local/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-streaming-flume-assembly_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-ab867e8f-f121-4402-a63c-942bac3932c1;1.0
    confs: [default]
    found org.apache.spark#spark-streaming-flume-assembly_2.12;2.4.5 in central
    found org.spark-project.spark#unused;1.0.0 in central
:: resolution report :: resolve 812ms :: artifacts dl 15ms
    :: modules in use:
    org.apache.spark#spark-streaming-flume-assembly_2.12;2.4.5 from central in [default]
    org.spark-project.spark#unused;1.0.0 from central in [default]
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
    ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-ab867e8f-f121-4402-a63c-942bac3932c1
    confs: [default]
    0 artifacts copied, 2 already retrieved (0kB/18ms)
20/05/15 15:35:18 WARN Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.19.137 instead (on interface ens33)
20/05/15 15:35:18 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
20/05/15 15:35:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  File "/home/hduser/pyspark_data1/spark_stream1/spark_flume.py", line 6
    artifactID=spark-streaming-flume_2.12
                                        ^
SyntaxError: invalid syntax
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[hduser@localhost spark_stream1]$ 
python pyspark data-science data-analysis flume
1 Answer

The failure is a SyntaxError on line 6 of your spark_flume.py file, at this line:

    artifactID=spark-streaming-flume_2.12

I believe you need to write spark-streaming-flume_2.12 as a string, i.e. wrapped in quotes ("...").
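As a minimal sketch of why quoting helps: unquoted, Python reads the artifact name as `spark - streaming - flume_2` followed by the number `.12`, which is not a valid expression. The snippet below checks both forms with the standard `ast` module. The helper name `parses` is hypothetical and only for illustration. Quoting addresses the SyntaxError only; the log above shows the package itself was already resolved by `--packages`.

```python
import ast

# The line from spark_flume.py exactly as reported in the traceback.
broken = "artifactID=spark-streaming-flume_2.12"

# The same assignment with the artifact name written as a string literal.
fixed = 'artifactID = "spark-streaming-flume_2.12"'


def parses(source):
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print(parses(broken))  # False
print(parses(fixed))   # True
```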
