I am trying to find out whether trigger files are present in an HDFS directory.
代码:
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

private static final int index = 23;

@SuppressWarnings("serial")
private static HashMap<String, Boolean> files = new HashMap<String, Boolean>() {{
    put("/user/ct_troy/allfiles/_TRIG1", false);
    put("/user/ct_troy/allfiles/_TRIG2", false);
    put("/user/ct_troy/allfiles/_TRIG3", false);
    put("/user/ct_troy/allfiles/_TRIG4", false);
    put("/user/ct_troy/allfiles/_TRIG5", false);
}};

private static boolean availableFiles(String file_name) {
    Configuration config = new Configuration();
    config.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
    config.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
    try {
        FileSystem hdfs = FileSystem.get(config);
        // Hadoop DFS Path - input file
        Path path = new Path(file_name); // file_name - complete path and file name
        // Check if the input path exists
        if (hdfs.exists(path) == false) {
            System.out.println(file_name + " not found.");
            throw new FileNotFoundException(file_name.substring(index));
        } else {
            System.out.println(file_name + " File Present.");
            return true;
        }
    } catch (IOException e) {
        // note: the empty catch also swallows the FileNotFoundException thrown above
    }
    return false;
}
I am passing the keys of the HashMap<> files as the file_name argument to the function availableFiles. I built a jar and ran it on a node, and it gave me the following output:

_TRIG2 not found.
_TRIG3 not found.
_TRIG1 not found.
_TRIG4 not found.
_TRIG5 not found.

I am not sure why this happens: _TRIG1, _TRIG2 and _TRIG3 exist, while _TRIG4 and _TRIG5 do not, yet it gives me the same result for all the trigger files. Help.
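(As an aside on the output format: the bare names such as `_TRIG1` in a stack trace or log presumably come from `file_name.substring(index)`, since with `index = 23` it strips the 23-character prefix `/user/ct_troy/allfiles/`. A minimal standalone check of that assumption:)

```java
public class PrefixDemo {
    public static void main(String[] args) {
        final int index = 23; // "/user/ct_troy/allfiles/" is 23 characters long
        String fileName = "/user/ct_troy/allfiles/_TRIG1";
        // substring(index) drops the directory prefix, leaving the bare trigger name
        System.out.println(fileName.substring(index)); // prints "_TRIG1"
    }
}
```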
From the official documentation, you can check whether the required file exists with a direct built-in call: @see https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#test
test
Usage: hadoop fs -test -[defswrz] URI
Options:
-d: if the path is a directory, return 0.
-e: if the path exists, return 0.
-f: if the path is a file, return 0.
-s: if the path is not empty, return 0.
-w: if the path exists and write permission is granted, return 0.
-r: if the path exists and read permission is granted, return 0.
-z: if the file is zero length, return 0.
Example:
hadoop fs -test -e filename
Your Java code looks fine. Perhaps your test is not going well:
if (!hdfs.exists(path)) { // <=================== @see the change and test it.
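For what it's worth, the loop over the question's map can be sketched with a plain in-memory stand-in for hdfs.exists, just to see the control flow in isolation. The set of "present" paths below is made up for the demo; on a real cluster you would call hdfs.exists(new Path(key)) instead:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class TriggerCheckDemo {
    // Stand-in for hdfs.exists(path): here the "existing" paths are just a set.
    static boolean exists(Set<String> present, String path) {
        return present.contains(path);
    }

    public static void main(String[] args) {
        Map<String, Boolean> files = new HashMap<>();
        files.put("/user/ct_troy/allfiles/_TRIG1", false);
        files.put("/user/ct_troy/allfiles/_TRIG2", false);

        // Pretend only _TRIG1 is on HDFS.
        Set<String> present = Set.of("/user/ct_troy/allfiles/_TRIG1");

        for (Map.Entry<String, Boolean> e : files.entrySet()) {
            // The suggested form: !exists(...) rather than exists(...) == false
            if (!exists(present, e.getKey())) {
                System.out.println(e.getKey().substring(23) + " not found.");
            } else {
                e.setValue(true); // mark the trigger as seen
            }
        }
        // prints exactly one line: "_TRIG2 not found."
    }
}
```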