Launching a spark-submit job for Kafka under Kerberos


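After some tinkering, I can partially launch a spark-submit job with the command below, but it crashes shortly after starting and gives me the exception shown further down: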

spark-submit command:

su spark -c 'export SPARK_MAJOR_VERSION=2; spark-submit \
  --verbose \
  --master yarn \
  --driver-cores 5 \
  --num-executors 3 --executor-cores 6 \
  --principal [email protected] \
  --keytab /etc/security/keytabs/spark.headless.keytab \
  --driver-java-options "-Djava.security.auth.login.config=kafka_client_jaas.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=kafka_client_jaas.conf" \
  --files "/tmp/kafka_client_jaas.conf,/tmp/kafka.service.keytab" \
  --class au.com.XXX.XXX.spark.test.test test.jar application.properties'

EXCEPTION:

Caused by: org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner  authentication information from the user

WARN KerberosLogin: [Principal=kafka/[email protected]]: TGT renewal thread has been interrupted and will exit.

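How do I get Kerberos to kinit two principals at the same time? I assume that is the problem here. I tried adding a second set of --principal / --keytab options to the initial command, but that only caused more permission problems in HDFS.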

hadoop apache-spark apache-kafka kerberos hortonworks-data-platform
1 Answer

This is an old topic, but I struggled with this for a while, so I hope this helps someone.

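The likely cause is that the Spark executors cannot find the keytab, so they fail Kerberos authentication (the Kafka client then falls back to asking for a password, which is the error you see). At submit time, pass the JAAS config and the keytab file to the executors with the following options: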

spark-submit --master yarn --deploy-mode cluster \
  --files /path/to/keytab/yourkeytab.keytab#yourkeytab.keytab,/path/to/jaas/your-kafka-jaas.conf#your-kafka-jaas.conf \
  --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=your-kafka-jaas.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=your-kafka-jaas.conf" \
  --driver-java-options "-Djava.security.auth.login.config=your-kafka-jaas.conf" \
  your-application.jar
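To check which file names the JAAS config can actually refer to, here is a small sketch that lists the names a `--files` value would produce in the container working directory. It assumes YARN's usual behaviour, where `path#alias` is linked under `alias` and an entry without a fragment keeps its base name. The helper name `localized_names` is hypothetical and not part of Spark.

```shell
#!/bin/sh
# Hypothetical helper: print the name each --files entry is expected to have
# in the YARN container working directory (alias after '#', else base name).
localized_names() {
    # Split the comma-separated list into one entry per line.
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        case "$entry" in
            *'#'*) printf '%s\n' "$entry" | sed 's/.*#//' ;;  # explicit alias
            *)     basename "$entry" ;;                       # no alias: base name
        esac
    done
}

localized_names "/path/to/keytab/yourkeytab.keytab#yourkeytab.keytab,/path/to/jaas/your-kafka-jaas.conf#your-kafka-jaas.conf"
```

Whatever name this prints for the keytab is the name the `keyTab` entry in the JAAS file has to use. An alias without the `.keytab` suffix would be linked under that shorter name.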

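Finally, because the JAAS file is shipped to the executors (and to the Spark driver), it should reference the keytab by a relative path rather than an absolute one. Your JAAS config should then contain the following line: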

keyTab="./yourkeytab.keytab"
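
For context, a complete JAAS file built around that line might look like the sketch below. It assumes the `KafkaClient` section name that Kafka clients look up and the JDK's `Krb5LoginModule`. The principal, realm, and `serviceName` value are placeholders, not values taken from the question.

```shell
#!/bin/sh
# Write a sample JAAS file. Principal, realm and file names are placeholders.
cat > your-kafka-jaas.conf <<'EOF'
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  useTicketCache=false
  keyTab="./yourkeytab.keytab"
  serviceName="kafka"
  principal="your-principal@YOUR.REALM";
};
EOF

# Sanity check: the keytab path should be relative so that it resolves
# inside each container's working directory.
if grep -q 'keyTab="/' your-kafka-jaas.conf; then
    echo "keyTab is absolute; use a relative path" >&2
else
    echo "keyTab is relative"
fi
```

With `useKeyTab=true` and `useTicketCache=false`, the Kafka client logs in from the shipped keytab on its own, so no separate kinit for the Kafka principal should be needed alongside Spark's `--principal` / `--keytab`.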