NoSuchMethodError when restoring a Flink job from a savepoint


I am using Apache Flink 1.16.1. When I deploy a job with the fromSavepoint option, I get the following error:

"java.lang.NoSuchMethodError: org.apache.commons.cli.CommandLine.hasOption(Lorg/apache/commons/cli/Option;)Z
    at org.apache.flink.client.cli.CliFrontendParser.createSavepointRestoreSettings(CliFrontendParser.java:631) ~[flink-dist-1.16.1.jar:1.16.1]
    at org.apache.flink.client.cli.ProgramOptions.<init>(ProgramOptions.java:119) ~[flink-dist-1.16.1.jar:1.16.1]
    at org.apache.flink.client.cli.ProgramOptions.create(ProgramOptions.java:192) ~[flink-dist-1.16.1.jar:1.16.1]
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:230) ~[flink-dist-1.16.1.jar:1.16.1]
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1087) ~[flink-dist-1.16.1.jar:1.16.1]
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1165) ~[flink-dist-1.16.1.jar:1.16.1]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_372]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_372]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist-1.16.1.jar:1.16.1]
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1165) [flink-dist-1.16.1.jar:1.16.1]"

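This looks like a commons-cli version mismatch. commons-cli must be at least version 1.5.0 for the job to deploy, yet the deployment still fails. I checked every package in my Maven file, and none of them pulls in a version below 1.5.0. Here is the interesting part.

This is the source code of the method named in the stack trace: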

    public static SavepointRestoreSettings createSavepointRestoreSettings(CommandLine commandLine) {
        if (commandLine.hasOption(SAVEPOINT_PATH_OPTION.getOpt())) {
            String savepointPath = commandLine.getOptionValue(SAVEPOINT_PATH_OPTION.getOpt());
            boolean allowNonRestoredState =
                    commandLine.hasOption(SAVEPOINT_ALLOW_NON_RESTORED_OPTION.getOpt());
            final RestoreMode restoreMode;
            if (commandLine.hasOption(SAVEPOINT_RESTORE_MODE)) {
                restoreMode =
                        ConfigurationUtils.convertValue(
                                commandLine.getOptionValue(SAVEPOINT_RESTORE_MODE),
                                RestoreMode.class);
            } else {
                restoreMode = SavepointConfigOptions.RESTORE_MODE.defaultValue();
            }
            return SavepointRestoreSettings.forPath(
                    savepointPath, allowNonRestoredState, restoreMode);
        } else {
            return SavepointRestoreSettings.none();
        }
    }

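The job fails at CliFrontendParser.java:631, on if (commandLine.hasOption(SAVEPOINT_RESTORE_MODE)). The exception is thrown there, yet a similar check five lines earlier, at CliFrontendParser.java:626, passes: if (commandLine.hasOption(SAVEPOINT_PATH_OPTION.getOpt())). As you can see, the failing call does not use .getOpt(), so it passes the Option object itself instead of a String, which may be what causes the problem. What can I do to solve this?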

Edit: To make sure the code itself is not the problem, I ran the following:

    val opt = new Option("rm", "restoreMode", true, "Defines how should we restore from the given savepoint. Supported options: " + "[claim - claim ownership of the savepoint and delete once it is" + " subsumed, no_claim (default) - do not claim ownership, the first" + " checkpoint will not reuse any files from the restored one, legacy " + "- the old behaviour, do not assume ownership of the savepoint files," + " but can reuse some shared files.")

    val cl = new CommandLine.Builder().build()

    if (cl.hasOption(opt.getOpt)) {
     logger.error("l!")
    } else {
      logger.error("p!")
    }

    if (cl.hasOption(opt)) {
      logger.error("y!")
    }
    else {
      logger.error("x!")
    }

It did not throw any exception. The versions and everything else look fine here, so why does Flink throw an exception in the same situation?

apache-flink flink-streaming

1 answer

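As you noted, this error usually means that an incompatible version of a jar is on the classpath (in your case, an older one). Checking your Maven dependencies is not enough, because Flink's classpath is built dynamically and is quite complex. I assume you have already read the Debugging Classloading page and have therefore tried the classloader.resolve-order configuration voodoo.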
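One way to confirm which commons-cli copy actually wins is to ask the JVM directly. The method descriptor in the error, hasOption(Lorg/apache/commons/cli/Option;)Z, names an overload that takes an Option and returns a boolean, so the check at line 626 (which passes a String) can succeed while the one at line 631 fails. The sketch below is a hypothetical helper, not part of Flink: it prints the jar a class was loaded from and whether a given overload exists. It only tells you something useful if it runs on the same classpath as the Flink client, and how you arrange that is an assumption you will need to adapt to your setup. Uber jars on that classpath, such as the flink-shaded-hadoop-2-uber jar that appears in the stack trace, are candidates worth inspecting, although whether that jar really bundles commons-cli is something to verify rather than assume.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from and whether it has a given method. */
public class ClassOriginCheck {

    /** Returns the jar or directory a class was loaded from, or "<bootstrap>" for JDK classes. */
    static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<bootstrap>";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the class has a public method with exactly these parameter types. */
    static boolean hasMethod(Class<?> cls, String name, Class<?>... paramTypes) {
        try {
            cls.getMethod(name, paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            Class<?> commandLine = Class.forName("org.apache.commons.cli.CommandLine");
            Class<?> option = Class.forName("org.apache.commons.cli.Option");
            System.out.println("CommandLine loaded from: " + originOf(commandLine));
            // The String overload is the one used at line 626 of CliFrontendParser.
            System.out.println("hasOption(String): " + hasMethod(commandLine, "hasOption", String.class));
            // The Option overload is the one the NoSuchMethodError complains about.
            System.out.println("hasOption(Option): " + hasMethod(commandLine, "hasOption", option));
        } catch (ClassNotFoundException e) {
            System.out.println("commons-cli is not on this classpath: " + e.getMessage());
        }
    }
}
```

If the second check prints false, the printed location points at the jar that shadows the newer commons-cli. This would also explain why the standalone test in the question passed: it presumably ran against the application's own dependencies, not the client's classpath.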

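I have sometimes had to use the JarHell code from the OpenSearch project to debug these problems. You usually have to edit it to ignore the conflicts that are unavoidable in Flink before you finally find the root cause.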
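The idea behind that kind of check can be sketched in a few lines. The code below is not the OpenSearch JarHell implementation, only a simplified stand-in under stated assumptions: it takes a directory of jars (for example Flink's lib directory, whose path is yours to supply) and an optional entry-name prefix, and lists every class file that appears in more than one jar. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Stream;

/** Hypothetical, simplified duplicate-class scan; not the OpenSearch JarHell code. */
public class DuplicateClassScan {

    /** Maps each .class entry name to the jars containing it, keeping only names found in two or more jars. */
    static Map<String, List<Path>> findDuplicates(List<Path> jars) throws IOException {
        Map<String, List<Path>> owners = new TreeMap<>();
        for (Path jar : jars) {
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    if (name.endsWith(".class")) {
                        owners.computeIfAbsent(name, k -> new ArrayList<>()).add(jar);
                    }
                }
            }
        }
        owners.values().removeIf(list -> list.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: DuplicateClassScan <jar-directory> [entry-prefix]");
            return;
        }
        // Optional prefix filter, e.g. org/apache/commons/cli/ to look only at commons-cli classes.
        String prefix = args.length > 1 ? args[1] : "";
        List<Path> jars = new ArrayList<>();
        try (Stream<Path> files = Files.list(Paths.get(args[0]))) {
            files.filter(p -> p.toString().endsWith(".jar")).sorted().forEach(jars::add);
        }
        findDuplicates(jars).forEach((entry, where) -> {
            if (entry.startsWith(prefix)) {
                System.out.println(entry + " -> " + where);
            }
        });
    }
}
```

Passing org/apache/commons/cli/ as the second argument plays the role of the editing described above: it filters out the many duplicates that are expected in a Flink distribution and leaves only the ones relevant to this error.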
