How do I configure spark-submit (or DataProc) to download Maven dependencies (jars) from GitHub Packages?

Problem description

I am trying to get spark-submit, run through GCP DataProc, to download Maven dependencies from a GitHub Packages repository.

Adding spark.jars.repositories=https://myuser:mytoken@maven.pkg.github.com/myorg/my-maven-packages-repo/ to the spark-submit command does not help...

The correct URL is requested, but the file is not downloaded (https://maven.pkg.github.com/myorg/my-maven-packages.repo/myorg/mylibrary/1.0.0/library-1.0.0.jar).

How can I make this work? (Without using uber-jars!)

apache-spark github ivy google-cloud-dataproc spark-submit
1 Answer

Instead of adding this to the spark-submit command:

spark.jars.repositories=https://myuser:mytoken@maven.pkg.github.com/myorg/my-maven-packages-repo/

add:

--files=gs://my-bucket/github-ivysettings.xml

spark.jars.ivySettings=github-ivysettings.xml

Upload the following file (github-ivysettings.xml) to the bucket:

<ivysettings>  

  <settings defaultResolver="default"/>

  <include url="${ivy.default.settings.dir}/ivysettings-public.xml"/>
  <include url="${ivy.default.settings.dir}/ivysettings-shared.xml"/>
  <include url="${ivy.default.settings.dir}/ivysettings-local.xml"/>
  <include url="${ivy.default.settings.dir}/ivysettings-main-chain.xml"/>
  <include url="${ivy.default.settings.dir}/ivysettings-default-chain.xml"/>

  <!-- Basic-auth credentials; host and realm must match the server's challenge -->
  <credentials
      host="maven.pkg.github.com" realm="GitHub Package Registry"
      username="myuser" passwd="mytoken"
      />
  <resolvers>
    <!-- Maven-layout resolver for the private GitHub Packages repository -->
    <ibiblio
        name="private-github"
        m2compatible="true" useMavenMetadata="true" usepoms="true"
        root="https://maven.pkg.github.com/myorg/my-maven-packages-repo/"
        pattern="[organisation]/[module]/[revision]/[artifact]-[revision](-[classifier]).[ext]"
        />
    <!-- Default chain: built-in resolvers first, then the private repository -->
    <chain name="default" returnFirst="true" checkmodified="true">
      <resolver ref="local" />
      <resolver ref="shared" />
      <resolver ref="public" />
      <resolver ref="private-github" />
    </chain>
  </resolvers>
</ivysettings>

This keeps the current search order (local, shared, public) and then searches your private repository.
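Once the settings file is in the bucket, the submission might look like the sketch below. It assumes the job is submitted with `gcloud dataproc jobs submit spark`; the cluster name `my-cluster`, region `us-central1`, main class `com.myorg.Main`, application jar `my-app.jar`, and Maven coordinates `myorg:mylibrary:1.0.0` are all placeholders, not values from the question. The script only prints the command, so it can be reviewed before it is run by hand.

```shell
#!/bin/sh
# Dry-run sketch: assembles and prints the gcloud command instead of running it.
# Every value below is a placeholder (assumption); adjust to your project.
CLUSTER="my-cluster"             # hypothetical Dataproc cluster name
REGION="us-central1"             # hypothetical region
BUCKET="gs://my-bucket"          # bucket that holds github-ivysettings.xml
PACKAGE="myorg:mylibrary:1.0.0"  # assumed coordinates (group:artifact:version)

# Spark properties: point Ivy at the staged settings file and request the package.
PROPS="spark.jars.ivySettings=github-ivysettings.xml,spark.jars.packages=${PACKAGE}"

CMD="gcloud dataproc jobs submit spark"
CMD="${CMD} --cluster=${CLUSTER} --region=${REGION}"
CMD="${CMD} --class=com.myorg.Main --jars=${BUCKET}/my-app.jar"
# --files stages the settings file in the job's working directory,
# which is why the property above can use a bare file name.
CMD="${CMD} --files=${BUCKET}/github-ivysettings.xml"
CMD="${CMD} --properties=${PROPS}"

echo "${CMD}"
```

Passing the package through `spark.jars.packages` rather than bundling it is what keeps the application jar from becoming an uber-jar.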

Note that the realm matters, so if you use this for a different private repository, change the host, root, and realm.
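One way to find the realm for another repository is to read it from the server's `WWW-Authenticate` response header, since Ivy applies a `<credentials>` entry only when both host and realm match. The helper below is a minimal sketch: `realm_of` is a hypothetical function name, and the sample header is constructed from the realm used above, not captured from a live server. Whether a given server sends the challenge on an unauthenticated request is also an assumption to verify.

```shell
#!/bin/sh
# Print the Basic-auth realm found in HTTP response headers read from stdin.
realm_of() {
  grep -i '^www-authenticate:' | sed -n 's/.*realm="\([^"]*\)".*/\1/p'
}

# Live check (needs network access; the URL is the root from the settings file):
#   curl -s -o /dev/null -D - "https://maven.pkg.github.com/myorg/my-maven-packages-repo/" | realm_of

# Offline demonstration with a sample header (constructed for illustration).
SAMPLE='WWW-Authenticate: Basic realm="GitHub Package Registry"'
REALM=$(printf '%s\r\n' "${SAMPLE}" | realm_of)
echo "${REALM}"
```

The printed value is what goes into the `realm` attribute of `<credentials>`; the `host` attribute is the host name of the same URL.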
