Get CSV to Spark dataframe

忘了有多久 2020-12-05 14:45

I'm using Python on Spark and would like to load a CSV file into a DataFrame.

The documentation for Spark SQL strangely does not provide explanations for CSV as a source.

9 answers
  •  青春惊慌失措
    2020-12-05 15:13

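    With more recent versions of Spark this has become a lot easier. The expression sqlContext.read (available since 1.4) gives you a DataFrameReader instance, and since Spark 2.0 that reader has a built-in .csv() method (on 1.4 through 1.6, CSV support came from the separate spark-csv package):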

    df = sqlContext.read.csv("/path/to/your.csv")
    

    If the CSV file has a header row, you can say so by passing the keyword argument header=True to the .csv() call. A handful of other options are available and are described in the link above.
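As a rough illustration of the header=True option, here is a minimal sketch assuming Spark 2.0 or later with pyspark installed. The helper name read_csv_with_header, the demo function, and the people.csv sample file are all hypothetical and only there for the example; inferSchema=True is an additional reader option that asks Spark to guess column types instead of reading everything as strings.

```python
import csv
import os
import tempfile


def read_csv_with_header(reader_source, path):
    """Read a CSV file whose first row holds the column names.

    reader_source is anything exposing .read, such as a SparkSession
    or the sqlContext used above (hypothetical helper, not a Spark API).
    """
    return reader_source.read.csv(path, header=True, inferSchema=True)


def demo():
    """Write a tiny sample CSV and load it with a local Spark session."""
    # Imported here so the helper above can be defined without pyspark.
    from pyspark.sql import SparkSession

    # Hypothetical sample data, written to a temporary directory.
    tmp_dir = tempfile.mkdtemp()
    sample_path = os.path.join(tmp_dir, "people.csv")
    with open(sample_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "age"])
        writer.writerow(["Alice", 30])
        writer.writerow(["Bob", 25])

    spark = SparkSession.builder.master("local[1]").appName("csv-demo").getOrCreate()
    df = read_csv_with_header(spark, sample_path)
    df.printSchema()  # column names come from the header row
    df.show()
    spark.stop()
```

Calling demo() in an environment with pyspark should print the schema and the two sample rows. Without header=True, Spark would instead treat the first row as data and name the columns _c0, _c1, and so on.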
