How to convert Spark RDD to pandas dataframe in ipython?
I have an RDD and I want to convert it to a pandas DataFrame. I know that to convert an RDD to a normal (Spark) DataFrame we can do

    df = rdd1.toDF()

But I want to convert the RDD to a pandas DataFrame, not a normal DataFrame. How can I do it?

You can use the function toPandas():

Returns the contents of this DataFrame as Pandas pandas.DataFrame. This is only available if Pandas is installed and available.

    >>> df.toPandas()
       age   name
    0    2  Alice
    1    5    Bob

RKD314

You'll have to use a Spark DataFrame as an intermediary step between your RDD and the desired Pandas DataFrame. For example, let's say I have a text