PySpark java.io.IOException: No FileSystem for scheme: https

情书的邮戳 2021-01-19 06:04

I am on a local Windows machine and trying to load an XML file with the following Python code, but I get this error. Does anyone know how to resolve it?

3 Answers
  •  陌清茗 (original poster)
    2021-01-19 06:55

    The error message says it all: Spark has no FileSystem implementation registered for the https scheme, so you cannot use the DataFrame reader and load to access files on the web (http or https). I suggest you first download the file locally.
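    As a minimal sketch of the "download first" step, the file can be fetched with Python's standard library and the resulting local path passed to Spark afterwards. The helper name download_to_local and the fallback file name download.xml are hypothetical, chosen only for illustration.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def download_to_local(url, dest_dir):
    """Fetch `url` with plain Python and return the path of the local copy.

    Hypothetical helper: Spark is not involved here, so the https scheme
    is handled by urllib rather than by a Hadoop FileSystem.
    """
    # Derive a file name from the URL path; fall back to an assumed default.
    filename = os.path.basename(urlparse(url).path) or "download.xml"
    local_path = os.path.join(dest_dir, filename)

    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)

    return local_path
```

    The returned path is an ordinary local file, so it can be given to the DataFrame reader's load in place of the original URL.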

    See the pyspark.sql.DataFrameReader docs for more on the available sources (in general, local file system, HDFS, and databases via JDBC).

    Unrelated to the error, note that you seem to be using the format part of the command incorrectly: assuming that you use the XML Data Source for Apache Spark package, the correct usage is format('com.databricks.spark.xml') (see the example).
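    A rough sketch of that usage, assuming the XML Data Source for Apache Spark package is already on the Spark classpath and that `spark` is an existing SparkSession. The wrapper name read_xml and the example values books.xml and book are hypothetical; rowTag is the package option that names the XML element treated as one row.

```python
def read_xml(spark, path, row_tag):
    """Load a local XML file into a DataFrame via the spark-xml data source.

    `spark` is an existing SparkSession; `path` should point to a local
    (or HDFS) file, not an http/https URL.
    """
    return (
        spark.read
        .format("com.databricks.spark.xml")  # data source name from the answer
        .option("rowTag", row_tag)           # XML element that maps to one row
        .load(path)
    )

# Example call with hypothetical values:
# df = read_xml(spark, "books.xml", "book")
```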
