Error "Failed to find data source: mongodb" is reported even though the sbt assembly jar includes mongo-spark-connector v10


I want to connect to MongoDB with a Spark DataFrame and write Parquet files. I configured mongo-spark-connector v10.2.2 in my sbt file, and it works locally. In production, however, the Spark application reports:

java.lang.ClassNotFoundException: Failed to find data source: mongodb. Please find packages at http://spark.apache.org/third-party-projects.html

(All external libraries are built into assembly.jar.)

I checked assembly.jar, and sbt did pull in the connector jar. Is there something I'm missing?
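Having the connector classes in the jar is not quite the same as having the connector registered. Spark resolves short format names such as `mongodb` through `java.util.ServiceLoader`, which reads `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister`. A minimal sketch for listing what that file contains in a given jar follows. The object and method names are made up for illustration, and the default path is the assembly jar location mentioned in this question.

```scala
import java.util.zip.ZipFile
import scala.io.Source

object CheckDataSourceRegister {
  // Service file consulted by java.util.ServiceLoader when Spark resolves a short format name.
  val ServiceEntry = "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"

  // Keep only provider class names: drop blank lines and '#' comments.
  def parseProviders(text: String): List[String] =
    text.split("\n").toList.map(_.trim).filter(l => l.nonEmpty && !l.startsWith("#"))

  // Read the service file from a jar; an empty list means the entry is missing or empty.
  def registeredSources(jarPath: String): List[String] = {
    val zip = new ZipFile(jarPath)
    try {
      Option(zip.getEntry(ServiceEntry)).toList.flatMap { entry =>
        val text = Source.fromInputStream(zip.getInputStream(entry), "UTF-8").mkString
        parseProviders(text)
      }
    } finally zip.close()
  }

  def main(args: Array[String]): Unit = {
    // Default path taken from the question; pass another jar as the first argument.
    val jar = args.headOption.getOrElse("/opt/app/lib/assembly-1.0.jar")
    val providers = registeredSources(jar)
    providers.foreach(println)
    // The connector's classes live under the com.mongodb.spark package.
    if (!providers.exists(_.startsWith("com.mongodb.spark")))
      println("No MongoDB connector provider is registered in " + jar)
  }
}
```

If the output lists Spark's own sources but nothing under `com.mongodb.spark`, the connector's registration was probably lost when the jars were merged, even though its classes are present.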

[Additional info]

Spark version: 3.1.3, with mongo-spark-connector v10.2.2.
The Spark application runs in client mode.

I manually downloaded mongo-spark-connector.jar and put it on the classpath with a command like

java -cp etc::/opt/app/lib/mongo-spark-connector_2.12-10.2.2.jar:/opt/app/lib/assembly-1.0.jar

and that works.

scala apache-spark jvm sbt sbt-assembly
1 Answer

@Gaël J

Here is my build.sbt file.

name := "app"

version := "1.0"

scalaVersion in ThisBuild := "2.12.10"
lazy val sparkVersion = "3.1.3"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.12" % sparkVersion,
  "org.apache.spark" % "spark-sql_2.12" % sparkVersion,
  "org.apache.spark" % "spark-avro_2.12" % sparkVersion,
  "org.apache.parquet" % "parquet-avro" % "1.12.2",
  "org.mongodb.spark" %% "mongo-spark-connector" % "10.2.2"
)

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".SF") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".RSA") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".DSA") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".TXT") => MergeStrategy.discard
  case PathList(ps@_*) if ps.last endsWith ".conf" => MergeStrategy.concat
  case x => MergeStrategy.first
}
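A likely explanation, given the catch-all `MergeStrategy.first` above: spark-sql, spark-avro, and the connector each ship their own `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister`, and `first` keeps only one of them, so the `mongodb` registration can be dropped from the assembly. That would also be consistent with the standalone connector jar working when it is added to the classpath. A minimal sketch of a possible fix, assuming this is indeed the cause, concatenates the service files instead. The new case has to appear before `case x`:

```scala
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
  // Assumed fix: merge ServiceLoader registration files from all jars instead of keeping one.
  case PathList("META-INF", "services", _*) => MergeStrategy.concat
  case x if x.startsWith("META-INF") && x.endsWith(".SF") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".RSA") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".DSA") => MergeStrategy.discard
  case x if x.startsWith("META-INF") && x.endsWith(".TXT") => MergeStrategy.discard
  case PathList(ps@_*) if ps.last endsWith ".conf" => MergeStrategy.concat
  case x => MergeStrategy.first
}
```

After rebuilding, the merged service file in the assembly should list the providers from all three jars, one class name per line.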

Here is the shell script that launches the application.

#!/bin/bash

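# Build the classpath: every jar in /opt/app/lib except app*.jar goes first.
CLASSPATH="";

for jar_file in /opt/app/lib/*.jar; do
    if [[ $jar_file != /opt/app/lib/app*.jar ]]; then
        CLASSPATH="$CLASSPATH:$jar_file";
    fi
done;

# The application jars (app*.jar) are appended last.
for jar_file in /opt/app/lib/app*.jar; do
    CLASSPATH="$CLASSPATH:$jar_file";
done;

# Prepend the etc directory so configuration files are found first.
CLASSPATH="etc:$CLASSPATH";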

JVM_OPTS="$JVM_OPTS -Dfile.encoding=UTF-8";
JVM_OPTS="$JVM_OPTS -verbose:gc";
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=20"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps";
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC";
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails";
JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation";
JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=1";
JVM_OPTS="$JVM_OPTS -XX:GCLogFileSize=1M";
JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch";
JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops";
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError";

MAIN_CLASS="App"

nohup java -classpath "$CLASSPATH" $JVM_OPTS "$MAIN_CLASS" "$@" > "$log" 2>&1 < /dev/null &