I\'m just wondering what is the difference between an RDD and DataFrame (Spark 2.0.0 DataFrame is a mere type alias for Dataset[Row]
RDD
DataFrame
Dataset[Row]
You can use RDD's with Structured and unstructured where as Dataframe/Dataset can only process Structured and Semi Structured Data (It is having proper schema)