Get CSV to Spark dataframe

忘了有多久 2020-12-05 14:45

I'm using Python on Spark and would like to load a CSV file into a DataFrame.

Strangely, the Spark SQL documentation does not explain how to use CSV as a source.

9 Answers
  •  感情败类
    2020-12-05 15:15

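    from pyspark import SQLContext

    # `sc` is the SparkContext that the pyspark shell creates for you.
    # Creating an SQLContext is what makes rdd.toDF() available.
    sqlContext = SQLContext(sc)

    # Read the file as plain text and split each line on commas.
    # The outer parentheses let the chained call span two lines.
    # Note: this keeps any header row and breaks on quoted commas.
    Employee_rdd = (sc.textFile(r"\..\Employee.csv")
                      .map(lambda line: line.split(",")))

    # Name the columns; every column comes out as a string.
    Employee_df = Employee_rdd.toDF(['Employee_ID', 'Employee_name'])

    Employee_df.show()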
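
One caveat with the snippet above: `line.split(",")` also splits inside quoted values, so a name such as "Smith, John" would end up in two columns. A minimal sketch of a more careful line parser, using Python's standard `csv` module, might look like this. The helper name `parse_csv_line` and the sample row are hypothetical and only for illustration.

```python
import csv

def parse_csv_line(line):
    """Split one CSV line into fields, honouring double-quoted values."""
    # csv.reader expects an iterable of lines, so wrap the single line in a list.
    return next(csv.reader([line]))

# Hypothetical sample row, used only to show the difference from split(",").
sample = '1,"Smith, John"'
print(parse_csv_line(sample))   # quoted comma stays inside one field
print(sample.split(","))        # naive split cuts the name in two
```

Assuming the rest of the answer stays the same, `.map(lambda line: line.split(","))` could then become `.map(parse_csv_line)`. On Spark 2.0 and later, the built-in reader `spark.read.csv(path, header=True, inferSchema=True)` loads a CSV file into a DataFrame directly, which avoids manual parsing altogether.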
    
