Input string error (NumberFormatException) with spark-submit

Question (0 votes, 1 answer)

I am trying to run some Spark Scala code:

import org.apache.spark.{SparkConf, SparkContext}
import scala.collection.mutable.ListBuffer

object EzRecoMRjobs {

  def main(args: Array[String]): Unit = {

    val conf = new SparkConf()
    conf.setMaster("local")
    conf.setAppName("Product Cardinalities")
    val sc = new SparkContext(conf)

    val dataset = sc.textFile(args(0))
    // Load parameters
    val customerIndex = args(1).toInt - 1
    val ProductIndex = args(2).toInt - 1
    val outputPath = args(3).toString


    // Count how often each product value occurs and write the counts to outputPath
    dataset.map { line =>
        // Each line is "<orderId>\t<col1>;<col2>;..."; keep only the product column
        line.split("\t")(1).split(";")(ProductIndex)
      }
      .map(x => (x, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile(outputPath)

    sc.stop()

  }
}

This code works in IntelliJ and writes the result to the "outputPath" folder. From my IntelliJ project I built a .jar file, and I want to run this code with spark-submit. So in my terminal I ran:

spark-submit \
  --jars /Users/users/Documents/TestScala/ezRecoPreBuild/target/ezRecoPreBuild-1.0-SNAPSHOT.jar \
  --class com.np6.scala.EzRecoMRjobs \
  --master local \
  /Users/users/Documents/DATA/data.txt 1 2 /Users/users/Documents/DATA/dossier 

But I got this error:

Exception in thread "main" java.lang.NumberFormatException: For input string: "/Users/users/Documents/DATA/dossier"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.parseInt(Integer.java:615)
    at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
    at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
    at com.np6.scala.EzRecoMRjobs$.main(ezRecoMRjobs.scala:51)
    at com.np6.scala.EzRecoMRjobs.main(ezRecoMRjobs.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

What is causing this error? Thanks.

string scala apache-spark input submit
Answer (2 votes)

See the documentation: https://spark.apache.org/docs/latest/submitting-applications.html

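The first positional argument to spark-submit must be the path to the application jar. Because you passed your jar with --jars, spark-submit took /Users/users/Documents/DATA/data.txt as the application jar, and every remaining argument shifted one position to the left: args(0) is "1", args(1) is "2", and args(2) is "/Users/users/Documents/DATA/dossier". Your code then calls args(2).toInt on that path, which throws the NumberFormatException.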
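The shift can be reproduced without Spark. The following is only a simplified sketch: `ArgShiftDemo` and `splitPositional` are hypothetical names that model the "first positional argument is the jar, the rest go to main" rule, not spark-submit's actual implementation.

```scala
import scala.util.Try

// Simplified, hypothetical model of how positional arguments are divided:
// the first one is treated as the application jar, the rest reach main(args).
object ArgShiftDemo {

  def splitPositional(positional: Seq[String]): (String, Array[String]) =
    (positional.head, positional.tail.toArray)

  def main(unused: Array[String]): Unit = {
    // Positional arguments of the failing command (the jar was given via --jars).
    val positional = Seq(
      "/Users/users/Documents/DATA/data.txt", "1", "2",
      "/Users/users/Documents/DATA/dossier")

    val (appJar, args) = splitPositional(positional)
    println(s"treated as application jar: $appJar")
    println(s"args seen by main: ${args.mkString(", ")}")

    // Same expression as in the question; here args(2) is the output path,
    // so the conversion fails with a NumberFormatException.
    println(Try(args(2).toInt - 1))
  }
}
```

Running it shows that `args` holds only three elements and that the last one, the output path, is what reaches `toInt`.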

The --jars flag is for listing additional jars that your application depends on, not the application jar itself.

You have to run the spark-submit command like this:

spark-submit \
  --class com.np6.scala.EzRecoMRjobs \
  --master "local[*]" \
  /Users/users/Documents/TestScala/ezRecoPreBuild/target/ezRecoPreBuild-1.0-SNAPSHOT.jar \
  /Users/users/Documents/DATA/data.txt 1 2 /Users/users/Documents/DATA/dossier
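To make this kind of mistake easier to spot, the job could validate its arguments before using them. This is a minimal sketch, assuming the job always takes exactly the four arguments shown in the question; `JobArgs`, `Parsed`, and `parse` are hypothetical names, not part of Spark or of the original code.

```scala
import scala.util.{Success, Try}

object JobArgs {

  // Hypothetical holder for the four arguments the job in the question expects.
  final case class Parsed(
      inputPath: String,
      customerIndex: Int,
      productIndex: Int,
      outputPath: String)

  // Returns Right(parsed) on success, or Left(message) describing what is wrong.
  def parse(args: Array[String]): Either[String, Parsed] = args match {
    case Array(input, customer, product, output) =>
      (Try(customer.toInt), Try(product.toInt)) match {
        // Command-line column numbers are 1-based; store them 0-based as the job does.
        case (Success(c), Success(p)) => Right(Parsed(input, c - 1, p - 1, output))
        case _ =>
          Left(s"customerIndex and productIndex must be integers, got '$customer' and '$product'")
      }
    case _ =>
      Left(
        "Expected 4 arguments <input> <customerIndex> <productIndex> <output>, " +
          s"got ${args.length}: ${args.mkString(" ")}")
  }
}
```

With the original command, `main` would receive only three arguments, so `JobArgs.parse(args)` would return a `Left` that names the wrong argument count instead of failing later inside `toInt`.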

Hope this helps.
