How to create an empty DataFrame? Why “ValueError: RDD is empty”?

孤城傲影 2021-02-01 03:48

I am trying to create an empty DataFrame in Spark (PySpark).

I am using an approach similar to the one discussed here, but it is not working.

11 answers
  •  渐次进展
    2021-02-01 04:11

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType

    spark = SparkSession.builder.appName('SparkPractice').getOrCreate()

    # Define the schema explicitly: an empty RDD has no rows to infer it from.
    schema = StructType([
        StructField('firstname', StringType(), True),
        StructField('middlename', StringType(), True),
        StructField('lastname', StringType(), True),
    ])

    # Passing the schema means Spark does not have to inspect any rows.
    df = spark.createDataFrame(spark.sparkContext.emptyRDD(), schema)
    df.printSchema()
    
