How to get column names with all values null?

再見小時候 2020-12-10 22:34

I can't figure out how to get the names of the columns in which every value is null.

For example,

case class A(name: String, id: String, email: String, company: String)


        
2 answers
  •  天命终不由人
    2020-12-10 22:58

    count ignores nulls, so a column whose values are all null returns a count of 0. Count every column, take the indices of the columns whose count is 0, and use them to subset df.columns:

    import org.apache.spark.sql.functions.{count,col}
    // Count the non-null values per column and keep the indices of columns whose count is 0
    val col_inds = df.select(df.columns.map(c => count(col(c)).alias(c)): _*)
                     .collect()(0)
                     .toSeq.zipWithIndex
                     .filter(_._1 == 0).map(_._2)
    // Subset column names using the indices
    col_inds.map(i => df.columns.apply(i))
    //Seq[String] = ArrayBuffer(id, company)
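
The same idea can be wrapped in a small helper that returns the column names directly. This is only a minimal sketch: the helper name `allNullColumns`, the object `AllNullColumns`, the local SparkSession settings, and the sample rows are all assumptions made for illustration (the question's own data is not shown). It also assumes column names contain no dots, since `col` would treat a dot as nested-field access.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count}

// Case class taken from the question.
case class A(name: String, id: String, email: String, company: String)

object AllNullColumns {
  // Hypothetical helper: names of the columns whose values are all null.
  // Runs a single aggregation, because count(col) skips nulls.
  def allNullColumns(df: DataFrame): Seq[String] = {
    if (df.columns.isEmpty) Seq.empty
    else {
      // One row holding the non-null count of every column, in column order.
      val counts = df.select(df.columns.map(c => count(col(c)).alias(c)): _*).head()
      df.columns.toSeq.zipWithIndex.collect {
        case (name, i) if counts.getLong(i) == 0L => name
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")            // assumption: local run for the demo
      .appName("all-null-columns")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows: id and company are null everywhere.
    val df = Seq(
      A("n1", null, "n1@c1.com", null),
      A("n2", null, "n2@c1.com", null)
    ).toDF()

    println(allNullColumns(df))
    spark.stop()
  }
}
```

With the sample rows above, the helper should report `id` and `company`. Note that a DataFrame with no rows at all reports every column, because every count is 0.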
    
