Spark Scala list folders in directory

北恋  2020-12-05 09:41

I want to list all folders within an HDFS directory using Scala/Spark. In Hadoop I can do this with the command: hadoop fs -ls hdfs://sandbox.hortonworks.com/demo/

9 Answers
  •  小蘑菇 (OP)  2020-12-05 10:22

    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("Demo").getOrCreate()
    val path = new Path("enter your directory path")
    // obtain the FileSystem for this path from Spark's Hadoop configuration
    val fs: FileSystem = path.getFileSystem(spark.sparkContext.hadoopConfiguration)
    val it = fs.listLocatedStatus(path)


    This creates a RemoteIterator it over org.apache.hadoop.fs.LocatedFileStatus entries, one for each file or subdirectory under the given path.
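
    Since the question asks specifically for folders, here is a minimal sketch of consuming that iterator and keeping only directories. It reuses the spark session from the snippet above; the path is the example path from the question and is only illustrative.

    import org.apache.hadoop.fs.Path
    import scala.collection.mutable.ArrayBuffer

    val dirPath = new Path("hdfs://sandbox.hortonworks.com/demo/")
    val fs = dirPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
    val statuses = fs.listLocatedStatus(dirPath)

    val folders = ArrayBuffer[String]()
    while (statuses.hasNext) {          // RemoteIterator is not a Scala Iterator,
      val status = statuses.next()      // so consume it with hasNext/next()
      if (status.isDirectory)           // keep only subdirectories
        folders += status.getPath.toString
    }
    folders.foreach(println)

    Alternatively, fs.listStatus(dirPath).filter(_.isDirectory) returns a plain array and avoids the manual while loop, at the cost of loading all entries at once.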
