Read all files in a nested folder in Spark

无人共我 2020-12-30 03:15

If we have a folder named folder containing only .txt files, we can read them all using sc.textFile("folder/*.txt"). But what if I have a folder whose nested subfolders also contain .txt files? How can I read all of them?
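For reference, a minimal sketch of the flat case above (the local master setting and the folder path are placeholders, not part of the original question):

    import org.apache.spark.sql.SparkSession

    // Build a local session just for illustration.
    val spark = SparkSession.builder().appName("flat-read").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // A single-level glob matches .txt files directly under folder/, but not files in its subfolders.
    val flat = sc.textFile("folder/*.txt")
    println(s"lines read: ${flat.count()}")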

4 Answers
  •  情话喂你 2020-12-30 03:32

    Spark 3.0 provides the option recursiveFileLookup to load files recursively from nested subfolders.

        val df = sparkSession.read
          .option("recursiveFileLookup", "true")
          .option("header", "true")
          .csv("src/main/resources/nested")


    This recursively loads the files from src/main/resources/nested and its subfolders.
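
    Since the original question is about .txt files, the same option should also work with the plain text source; a minimal sketch assuming Spark 3.0+ (the path below is just the example path from above):

        import org.apache.spark.sql.SparkSession

        val spark = SparkSession.builder().appName("recursive-read").master("local[*]").getOrCreate()

        // recursiveFileLookup tells the reader to descend into every subdirectory under the path.
        val lines = spark.read
          .option("recursiveFileLookup", "true")
          .text("src/main/resources/nested")   // one row per line of text, in a column named "value"

        lines.show(5, truncate = false)

    Note that when recursiveFileLookup is enabled, Spark does not perform partition discovery on the directory structure, so it is not meant to be combined with partitioned-directory layouts.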
