I am using Spark 2.2.1 and CarbonData 1.5.3. Following the instructions in the CarbonData official guide, I can run the import statements: import org.apache.spark.sql.SparkSession and import org.apache.spark.sql.CarbonSession._
But the next step fails with the following log:
scala> val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("<carbon_store_path>")
19/05/08 12:14:17 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect.
19/05/08 12:14:17 WARN CarbonProperties: The enable unsafe sort value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The enable off heap sort value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The custom block distribution value "null" is invalid. Using the default value "false
19/05/08 12:14:17 WARN CarbonProperties: The enable vector reader value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The carbon task distribution value "null" is invalid. Using the default value "block
19/05/08 12:14:17 WARN CarbonProperties: The enable auto handoff value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The specified value for property 512is invalid.
19/05/08 12:14:17 WARN CarbonProperties: The specified value for property carbon.sort.storage.inmemory.size.inmbis invalid. Taking the default value.512
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.CarbonSessionStateBuilder
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
at org.apache.spark.util.CarbonReflectionUtils$.createObject(CarbonReflectionUtils.scala:324)
at org.apache.spark.util.CarbonReflectionUtils$.getSessionState(CarbonReflectionUtils.scala:220)
at org.apache.spark.sql.CarbonSession.sessionState$lzycompute(CarbonSession.scala:57)
at org.apache.spark.sql.CarbonSession.sessionState(CarbonSession.scala:56)
at org.apache.spark.sql.CarbonSession$CarbonBuilder$$anonfun$getOrCreateCarbonSession$2.apply(CarbonSession.scala:260)
at org.apache.spark.sql.CarbonSession$CarbonBuilder$$anonfun$getOrCreateCarbonSession$2.apply(CarbonSession.scala:260)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at org.apache.spark.sql.CarbonSession$CarbonBuilder.getOrCreateCarbonSession(CarbonSession.scala:260)
at org.apache.spark.sql.CarbonSession$CarbonBuilder.getOrCreateCarbonSession(CarbonSession.scala:169)
... 50 elided
My JAVA_HOME is set; otherwise Spark would not run at all, and I can run Spark applications fine without CarbonData. It only fails when I try to create a CarbonSession. SPARK_HOME points to Spark's bin directory.
I am running Spark on my local machine, using the local file system for storage, and without Hive.
Thanks for any help. Let me know if you need any other details.
Please make sure your CarbonData jar matches your Spark version. I had that same error at first.
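The ClassNotFoundException for org.apache.spark.sql.hive.CarbonSessionStateBuilder is the typical symptom of a version mismatch, because that class lives in the Spark-version-specific part of the CarbonData build. A rough sanity check, assuming you use an official CarbonData release jar (the filename below is an illustrative example, not necessarily the jar you have): the release jar name encodes the Spark version it was built against, which should match the Spark you launch.

```shell
# Illustrative jar name: CarbonData release jars encode the target Spark
# version, e.g. apache-carbondata-<ver>-bin-spark<spark_ver>-hadoop<hadoop_ver>.jar
jar="apache-carbondata-1.5.3-bin-spark2.3.2-hadoop2.7.2.jar"

# Extract the Spark version the jar was built for and compare it with
# the version your spark-shell reports (spark.version in the REPL).
built_for=$(echo "$jar" | sed -n 's/.*-bin-spark\([0-9.]*\)-hadoop.*/\1/p')
echo "jar built for Spark $built_for"
```

If the versions differ (for example, a jar built for Spark 2.3.x launched under Spark 2.2.1), either switch to a CarbonData jar built for your Spark version or upgrade Spark to match the jar, then start the shell with `spark-shell --jars /path/to/<matching-carbondata-jar>`.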