How can I check with Java whether a file exists in HDFS?

I am trying to check whether my trigger files are present in an HDFS directory.

Code:

    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.util.HashMap;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Length of the "/user/ct_troy/allfiles/" prefix, used to strip the directory part.
    private static final int index = 23;

    // Trigger file path -> whether it has been found.
    @SuppressWarnings("serial")
    private static HashMap<String, Boolean> files = new HashMap<String, Boolean>() {{
        put("/user/ct_troy/allfiles/_TRIG1", false);
        put("/user/ct_troy/allfiles/_TRIG2", false);
        put("/user/ct_troy/allfiles/_TRIG3", false);
        put("/user/ct_troy/allfiles/_TRIG4", false);
        put("/user/ct_troy/allfiles/_TRIG5", false);
    }};

    private static boolean availableFiles(String file_name) {
        Configuration config = new Configuration();
        config.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        config.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
        try {
            FileSystem hdfs = FileSystem.get(config);
            // Hadoop DFS path of the input file (file_name is the complete path and file name).
            Path path = new Path(file_name);
            // Check whether the input exists.
            if (hdfs.exists(path) == false) {
                System.out.println(file_name + " not found.");
                throw new FileNotFoundException(file_name.substring(index));
            } else {
                System.out.println(file_name + " File Present.");
                return true;
            }
        } catch (IOException e) {
            // Empty: any IOException, including the FileNotFoundException above, is swallowed here.
        }
        return false;
    }

I pass each key of the HashMap `files` as the file_name argument to the function availableFiles. I built a jar and ran it on a node, and it gave me the following output:

_TRIG2 not found.
_TRIG3 not found.
_TRIG1 not found.
_TRIG4 not found.
_TRIG5 not found.

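I am not sure why this happens: _TRIG1, _TRIG2 and _TRIG3 exist, while _TRIG4 and _TRIG5 do not, yet I get the same result for every trigger file. Any help is appreciated.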
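
The calling pattern described above (passing every key of `files` to a check and recording the result) might be sketched roughly as follows. This is only an illustration: `TriggerScan`, `refresh` and the `fakeCheck` predicate are hypothetical names, and the predicate merely stands in for `availableFiles`, so no cluster is needed to run it. It also uses a `LinkedHashMap`, because a plain `HashMap` iterates in an unspecified order, which would explain why the output above is not sorted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

class TriggerScan {
    // Re-evaluates every entry: the value becomes true when the checker reports the path as present.
    static void refresh(Map<String, Boolean> files, Predicate<String> exists) {
        files.replaceAll((path, previous) -> exists.test(path));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order; a plain HashMap does not guarantee any order.
        Map<String, Boolean> files = new LinkedHashMap<>();
        for (int i = 1; i <= 5; i++) {
            files.put("/user/ct_troy/allfiles/_TRIG" + i, false);
        }
        // Hypothetical stand-in for availableFiles: pretends only _TRIG1 to _TRIG3 exist.
        Predicate<String> fakeCheck = p -> p.matches(".*_TRIG[1-3]");
        refresh(files, fakeCheck);
        files.forEach((path, present) -> System.out.println(path + " -> " + present));
    }
}
```

In a real job the predicate would delegate to the HDFS check instead of a pattern match.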

java maven hdfs hadoop2
1 Answer

According to the official documentation, you can check whether the required file exists directly with the `test` shell command; see https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#test

test

Usage: hadoop fs -test -[defswrz] URI

Options:

    -d: if the path is a directory, return 0.
    -e: if the path exists, return 0.
    -f: if the path is a file, return 0.
    -s: if the path is not empty, return 0.
    -w: if the path exists and write permission is granted, return 0.
    -r: if the path exists and read permission is granted, return 0.
    -z: if the file is zero length, return 0.

Example:

    hadoop fs -test -e filename
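
If you want to reuse this shell check from Java, one possible approach is to start the command with `ProcessBuilder` and read its exit code. The sketch below assumes that the `hadoop` executable is on the `PATH` of the node running the jar; `HdfsShellTest`, `existsCommand` and `exists` are names made up for this example.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class HdfsShellTest {
    // Builds the argument list for `hadoop fs -test -e <path>`.
    static List<String> existsCommand(String path) {
        return Arrays.asList("hadoop", "fs", "-test", "-e", path);
    }

    // Runs the command and treats exit code 0 as "path exists".
    // Assumption: the `hadoop` executable can be found on the PATH.
    static boolean exists(String path) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(existsCommand(path))
                .inheritIO() // forward any hadoop error output to the console
                .start();
        return process.waitFor() == 0;
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "/user/ct_troy/allfiles/_TRIG1";
        System.out.println(path + (exists(path) ? " exists" : " does not exist"));
    }
}
```

Starting a JVM per check is slow, so this is probably better suited to a quick cross-check of the Java API result than to production use.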

Your Java code looks fine. Perhaps the condition in your check is what goes wrong:

    if (!hdfs.exists(path)) { // <=================== @see the change and test it.
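
Note that `!hdfs.exists(path)` and `hdfs.exists(path) == false` evaluate to the same thing, so if the output does not change, the error handling is worth a look. In the question's code the `FileNotFoundException` is thrown inside the `try` and, being an `IOException`, is caught by the empty `catch`, so a genuine failure and a missing file both end in `return false`. Below is a minimal sketch of keeping the two cases apart. `ExistenceCheck`, `Probe` and `Result` are hypothetical names, and the `Probe` interface only stands in for the Hadoop call so that the example runs without a cluster.

```java
import java.io.IOException;

class ExistenceCheck {
    // Hypothetical abstraction; with Hadoop this would wrap hdfs.exists(new Path(p)).
    interface Probe {
        boolean exists(String path) throws IOException;
    }

    enum Result { PRESENT, MISSING, ERROR }

    static Result check(Probe probe, String path) {
        try {
            return probe.exists(path) ? Result.PRESENT : Result.MISSING;
        } catch (IOException e) {
            // Report the failure instead of swallowing it in an empty catch block.
            System.err.println("Could not check " + path + ": " + e);
            return Result.ERROR;
        }
    }

    public static void main(String[] args) {
        // Simulates a file system that cannot be reached.
        Probe failing = p -> { throw new IOException("simulated connection failure"); };
        // Uses the local file system as a stand-in for HDFS.
        Probe localFiles = p -> java.nio.file.Files.exists(java.nio.file.Paths.get(p));
        System.out.println(check(failing, "/user/ct_troy/allfiles/_TRIG1"));
        System.out.println(check(localFiles, "/definitely/not/here/_TRIG4"));
    }
}
```

It may also be worth checking which file system `FileSystem.get(config)` actually returns: it is resolved from `fs.defaultFS`, and when the cluster configuration files are not on the classpath this defaults to the local file system, where none of the trigger paths would exist.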