How to list all files in a directory and its subdirectories in Hadoop HDFS

故里飘歌 2020-12-01 05:50

I have a folder in HDFS that has two subfolders, each of which has about 30 subfolders, each of which in turn contains XML files. I want to list all XML files giving only the mai

9 answers
  •  再見小時候
    2020-12-01 06:09

    Thanks Radu Adrian Moldovan for the suggestion.

    Here is an implementation using a queue:

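    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Queue;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Breadth-first walk: returns the path of every file under hdfsFilePath.
    private static List<String> listAllFilePath(Path hdfsFilePath, FileSystem fs)
    throws FileNotFoundException, IOException {
      List<String> filePathList = new ArrayList<>();
      Queue<Path> fileQueue = new LinkedList<>();  // paths still to visit
      fileQueue.add(hdfsFilePath);
      while (!fileQueue.isEmpty()) {
        Path filePath = fileQueue.remove();
        if (fs.isFile(filePath)) {
          // Regular file: record it.
          filePathList.add(filePath.toString());
        } else {
          // Directory: enqueue its children for a later iteration.
          FileStatus[] fileStatus = fs.listStatus(filePath);
          for (FileStatus fileStat : fileStatus) {
            fileQueue.add(fileStat.getPath());
          }
        }
      }
      return filePathList;
    }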
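
    The question asks for XML files only, while listAllFilePath returns every file it finds. One way to narrow the result is to filter the returned path strings by extension. The following is a minimal sketch in plain Java, not part of the original answer: the class name XmlPathFilter, the helper keepXml, and the sample paths are all hypothetical, and it assumes the XML files can be recognized by a ".xml" suffix (matched case-insensitively).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Hadoop.
class XmlPathFilter {

  // Returns the paths that end with ".xml" (case-insensitive), in input order.
  static List<String> keepXml(List<String> paths) {
    List<String> xmlPaths = new ArrayList<>();
    for (String path : paths) {
      if (path.toLowerCase(Locale.ROOT).endsWith(".xml")) {
        xmlPaths.add(path);
      }
    }
    return xmlPaths;
  }

  public static void main(String[] args) {
    // Made-up sample paths standing in for the output of listAllFilePath.
    List<String> allFiles = Arrays.asList(
        "hdfs://namenode/data/a/part1/doc1.xml",
        "hdfs://namenode/data/a/part1/_SUCCESS",
        "hdfs://namenode/data/b/part7/doc2.XML");
    System.out.println(keepXml(allFiles));
  }
}
```

    In real use, the list passed to keepXml would be the return value of listAllFilePath. With the sample input above, the two XML paths are kept and the _SUCCESS entry is dropped.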
    
