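EDIT: By removing the "setMaster" conf setting from the application, I was able to run yarn-cluster successfully. If anyone can help with running against the Spark master in cluster deploy mode, that would be fantastic.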
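My understanding of why removing "setMaster" helped, based on the Spark configuration documentation: properties set directly in the application take precedence over flags passed to spark-submit, which in turn take precedence over spark-defaults.conf. The snippet below is only a plain-Python sketch of that ordering, not Spark code. The function name `effective_conf` and the value `local[*]` are hypothetical, since the value that was actually hardcoded in the application is not shown here.

```python
from typing import Dict


def effective_conf(
    defaults_file: Dict[str, str],
    submit_flags: Dict[str, str],
    set_in_code: Dict[str, str],
) -> Dict[str, str]:
    """Merge three config sources, lowest precedence first (illustrative only)."""
    merged = dict(defaults_file)   # spark-defaults.conf: lowest precedence
    merged.update(submit_flags)    # spark-submit flags override the file
    merged.update(set_in_code)     # values set in the application override both
    return merged


# Hypothetical case: the application hardcodes a master, so --master is ignored.
with_set_master = effective_conf(
    {}, {"spark.master": "yarn-cluster"}, {"spark.master": "local[*]"}
)

# With the hardcoded value removed, the spark-submit flag takes effect.
without_set_master = effective_conf({}, {"spark.master": "yarn-cluster"}, {})
```

In the first case the effective master is the hardcoded one; in the second it is whatever was passed on the command line.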
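I'm trying to set up Spark on a local test machine so that I can read from an S3 bucket and then write back to it.
Running the jar/application in client mode works fine: it goes to the bucket, creates a file, and comes back again.
However, I need it to work in cluster mode so that it more closely resembles our production environment, and it keeps failing. I can't see any sensible messages in the logs, and there is very little feedback to go on.
Any help is greatly appreciated. I'm very new to Spark/Hadoop, so I may have overlooked something obvious.
I also tried running with yarn-cluster as the master, but that fails for a different reason (it says it can't find the s3Native class, which I am passing in as a jar).
This is a Windows environment.
The command I'm running: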
c:\>spark-submit --jars="C:\Spark\hadoop\share\hadoop\common\lib\hadoop-aws-2.7.1.jar,C:\Spark\hadoop\share\hadoop\common\lib\aws-java-sdk-1.7.4.jar" --verbose --deploy-mode cluster --master spark://127.0.0.1:7077 --class FileInputRename c:\sparkSubmit\sparkSubmit_NoJarSetInConf.jar "s3://bucket/jar/fileInputRename.txt"
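To make the pieces of that command easier to compare between the standalone and yarn-cluster attempts, here is a minimal sketch that assembles the same flags as a list. The helper `build_spark_submit_args` is hypothetical; it only uses the flags that already appear in the command above (`--jars`, `--verbose`, `--deploy-mode`, `--master`, `--class`) and does not launch anything.

```python
from typing import List, Optional


def build_spark_submit_args(
    app_jar: str,
    main_class: str,
    app_args: List[str],
    extra_jars: List[str],
    master: Optional[str] = None,
    deploy_mode: str = "cluster",
) -> List[str]:
    """Assemble a spark-submit argument list (hypothetical helper)."""
    args = ["spark-submit", "--verbose", "--deploy-mode", deploy_mode]
    if master is not None:
        # Only pass --master when one is given explicitly.
        args += ["--master", master]
    if extra_jars:
        # --jars takes a single comma-separated value with no spaces.
        args += ["--jars", ",".join(extra_jars)]
    # The application jar comes after the options, followed by its own arguments.
    args += ["--class", main_class, app_jar]
    args += app_args
    return args


example = build_spark_submit_args(
    app_jar=r"c:\sparkSubmit\sparkSubmit_NoJarSetInConf.jar",
    main_class="FileInputRename",
    app_args=["s3://bucket/jar/fileInputRename.txt"],
    extra_jars=[
        r"C:\Spark\hadoop\share\hadoop\common\lib\hadoop-aws-2.7.1.jar",
        r"C:\Spark\hadoop\share\hadoop\common\lib\aws-java-sdk-1.7.4.jar",
    ],
    master="spark://127.0.0.1:7077",
)
```

Keeping the arguments as a list means each path stays a single argument, so there is no extra quoting to get wrong.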
The output on the console is:
Using properties file: C:\Spark\bin\..\conf\spark-defaults.conf
Parsed arguments:
master spark://127.0.0.1:7077
deployMode cluster
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile C:\Spark\bin\..\conf\spark-defaults.conf
driverMemory null
driverCores null
driverExtraClassPath null
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files null
pyFiles null
archives null
mainClass FileInputRename
primaryResource file:/c:/sparkSubmit/sparkSubmit_NoJarSetInConf.jar
name FileInputRename
childArgs [s3://SessionCam-Steve/jar/fileInputRename.txt]
jars file:/C:/Spark/hadoop/share/hadoop/common/lib/hadoop-aws-2.7.1.jar,file:/C:/Spark/hadoop/share/hadoop/common/lib/aws-java-sdk-1.7.4.jar
packages null
packagesExclusions null
repositories null
verbose true
Spark properties used, including those specified through
--conf and those from the properties file C:\Spark\bin\..\conf\spark-defaults.conf:
Running Spark using the REST application submission protocol.
Main class:
org.apache.spark.deploy.rest.RestSubmissionClient
Arguments:
file:/c:/sparkSubmit/sparkSubmit_NoJarSetInConf.jar
FileInputRename
s3://SessionCam-Steve/jar/fileInputRename.txt
System properties:
SPARK_SUBMIT -> true
spark.driver.supervise -> false
spark.app.name -> FileInputRename
spark.jars -> file:/C:/Spark/hadoop/share/hadoop/common/lib/hadoop-aws-2.7.1.jar,file:/C:/Spark/hadoop/share/hadoop/common/lib/aws-java-sdk-1.7.4.jar,file:/c:/sparkSubmit/sparkSubmit_NoJarSetInConf.jar
spark.submit.deployMode -> cluster
spark.master -> spark://127.0.0.1:7077
Classpath elements:
16/03/24 12:01:56 INFO rest.RestSubmissionClient: Submitting a request to launch an application in spark://127.0.0.1:7077.
After a few more seconds it just returns to the C prompt with nothing else. The log on port 8080:
Application ID           Name             Cores  Memory per Node  Submitted Time       User           State     Duration
app-20160324120221-0016  FileInputRename  1      1024.0 MB        2016/03/24 12:02:21  Administrator  FINISHED  3 s
The error log only shows:
16/03/24 12:02:24 INFO spark.SecurityManager: Changing view acls to: Administrator
16/03/24 12:02:24 INFO spark.SecurityManager: Changing modify acls to: Administrator
16/03/24 12:02:24 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Administrator); users with modify permissions: Set(Administrator)
If I run with yarn-cluster as the master, then this is my command:
c:>spark-submit --jars="C:\Spark\hadoop\share\hadoo