I'm using Python on Spark and would like to load a CSV file into a DataFrame.
Strangely, the Spark SQL documentation does not explain how to use CSV as a source.
    from pyspark.sql.types import StringType
    from pyspark import SQLContext

    sqlContext = SQLContext(sc)

    Employee_rdd = sc.textFile("\..\Employee.csv").map(lambda line: line.split(","))

    Employee_df = Employee_rdd.toDF(['Employee_ID','Employee_name'])

    Employee_df.show()
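One thing worth checking with the `line.split(",")` approach: it splits on every comma, including commas inside quoted fields. Below is a minimal sketch of an alternative that parses lines with Python's standard `csv` module. The helper name `parse_csv_lines` and the sample rows are made up for illustration and are not from the question; the Spark usage in the trailing comment assumes `sc` is a live SparkContext.

```python
import csv


def parse_csv_lines(lines):
    """Yield one list of fields per CSV line, honoring quoted commas.

    `lines` can be any iterable of strings, so the function fits the
    iterator-in, iterator-out shape that RDD.mapPartitions expects.
    """
    for row in csv.reader(lines):
        yield row


# Hypothetical sample data; the second field of the first row contains a comma.
sample = ['1,"Smith, John"', '2,Jane Doe']

rows = list(parse_csv_lines(sample))
naive = [line.split(",") for line in sample]

print(rows)   # quoted comma kept inside one field
print(naive)  # first row is split into three pieces

# Possible use with Spark (assumption: `sc` exists and `path` points to the file):
# Employee_rdd = sc.textFile(path).mapPartitions(parse_csv_lines)
# Employee_df = Employee_rdd.toDF(['Employee_ID', 'Employee_name'])
```

Since `sc.textFile` yields one record per physical line, this sketch would still not handle quoted fields that span multiple lines.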