Where does Spark look for text files?

执念已碎 2021-02-08 09:29

I thought that loading text files is done only from workers / within the cluster (you just need to make sure all workers have access to the same path, either by having that text

2 answers
  •  不要未来只要你来
    2021-02-08 09:58

    Spark can look for files both locally and on HDFS.

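    If you'd like to read in a file using sc.textFile() and work with it as an RDD, then the path has to be reachable from every executor: either the file sits on HDFS (or another shared filesystem), or a copy exists at the same local path on every worker. If you just want to read in a file the normal way on the driver, use the ordinary file I/O of whichever API language you are working in (Scala, Java, Python).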

    If you have a local file on the driver, then addFile() ships it to every node along with the job, and SparkFiles.get() returns the path of the local copy on whichever node calls it.
