Filter Pyspark dataframe column with None value

小鲜肉 2020-11-29 18:10

I'm trying to filter a PySpark dataframe that has None as a row value:

df.select('dt_mvmt').distinct().collect()

[Row(dt_mvmt=u'2016-03-27


        
10 answers
  •  失恋的感觉
    2020-11-29 18:28

    None is the only value of the class NoneType in Python, and PySpark stores it as SQL NULL. The comparisons below will not work: comparing a column with None using == or != evaluates to NULL rather than True or False, so the filter never matches a row.

    Wrong way of filtering:

    df[df.dt_mvmt == None].count()   # returns 0
    df[df.dt_mvmt != None].count()   # returns 0
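
To see why both counts come back as 0, here is a minimal plain-Python sketch of SQL-style three-valued comparison. It is not PySpark code and not how Spark is implemented; the helpers `sql_eq`, `sql_ne` and `sql_filter` are hypothetical names made up for illustration, and the second date in the sample list is invented.

```python
from typing import Optional


def sql_eq(a, b) -> Optional[bool]:
    """Model SQL '=': the result is NULL (None here) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_ne(a, b) -> Optional[bool]:
    """Model SQL '<>': the three-valued negation of '='."""
    eq = sql_eq(a, b)
    return None if eq is None else not eq


def sql_filter(rows, predicate):
    """Keep a row only when the predicate is exactly True, like a WHERE clause."""
    return [r for r in rows if predicate(r) is True]


# Hypothetical sample values standing in for the dt_mvmt column.
dt_mvmt = ["2016-03-27", None, "2016-03-28"]

eq_none = sql_filter(dt_mvmt, lambda v: sql_eq(v, None))   # mimics == None
ne_none = sql_filter(dt_mvmt, lambda v: sql_ne(v, None))   # mimics != None
not_null = sql_filter(dt_mvmt, lambda v: v is not None)    # mimics isNotNull()

print(len(eq_none), len(ne_none), len(not_null))  # prints: 0 0 2
```

Under this model every comparison against None yields None, which a filter treats as "no match", so only an explicit null check can separate the rows.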

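    Correct way:

    from pyspark.sql.functions import col

    df.where(col("dt_mvmt").isNotNull())   # keeps only records where dt_mvmt is not None/Null
    df.where(col("dt_mvmt").isNull())      # returns all records with dt_mvmt as None/Null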
