How to delete columns in a PySpark DataFrame

滥情空心 2021-01-30 01:55
>>> a
DataFrame[id: bigint, julian_date: string, user_id: bigint]
>>> b
DataFrame[id: bigint, quan_created_money: decimal(10,0), quan_created_cnt: bigi         


        
8 answers
  •  灰色年华
    2021-01-30 02:51

    This is slightly off topic, since the question is about PySpark, but here is a solution in Scala. Take the array of column names from oldDataFrame, remove the ones you want to drop ("colExclude") with diff, and map the remaining names to Column objects. Then pass the resulting Array[Column] to select, unpacking it as varargs with ": _*".

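    import org.apache.spark.sql.{Column, DataFrame}
    // Keep every column except "colExclude", then select only those columns.
    val columnsToKeep: Array[Column] = oldDataFrame.columns.diff(Array("colExclude"))
                                                   .map(x => oldDataFrame.col(x))
    val newDataFrame: DataFrame = oldDataFrame.select(columnsToKeep: _*)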
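
Since the question asks about PySpark, the same filter-then-select idea can be sketched in Python. This is only a minimal sketch and not part of the original answer: the helper names columns_to_keep and drop_by_select are hypothetical, and df is assumed to be a pyspark.sql.DataFrame (the helpers rely only on its columns attribute and select method).

```python
def columns_to_keep(all_columns, exclude):
    """Return the names in all_columns that are not in exclude, keeping order."""
    excluded = set(exclude)
    return [name for name in all_columns if name not in excluded]


def drop_by_select(df, exclude):
    """Select every column of df except those named in exclude.

    Assumption: df is a pyspark.sql.DataFrame, so df.columns is a list of
    column names and df.select accepts column names as positional arguments.
    """
    return df.select(*columns_to_keep(df.columns, exclude))
```

With the question's DataFrame a, drop_by_select(a, ["julian_date"]) would keep id and user_id. For the common case PySpark also provides DataFrame.drop, for example a.drop("julian_date"), which returns a new DataFrame without the named column.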
    
