Multiple condition filter on dataframe

栀梦 2020-12-09 09:39

Can anyone explain why I am getting different results for these two expressions? I am trying to filter between two dates:

df.filter("act_date <='2017


        
2 Answers
  •  我在风中等你
    2020-12-09 10:16

    In the first case:

    df.filter("act_date <='2017-04-01'" and "act_date >='2016-10-01'")\
      .select("col1","col2").distinct().count()
    

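    Python evaluates the "and" between the two strings before filter is called, and "and" applied to two non-empty strings returns the second one. Only act_date >= '2016-10-01' reaches filter, so the result contains every value on or after 2016-10-01, including all the values after 2017-04-01.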
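
    This can be checked without Spark. A minimal sketch in plain Python, using the two condition strings from the question: the printed value is the single argument that filter would actually receive.

```python
# Python evaluates `a and b` itself, before filter() is ever called.
# For two non-empty (truthy) strings, `and` returns the second operand.
first = "act_date <='2017-04-01'"
second = "act_date >='2016-10-01'"

combined = first and second

# Only the second condition survives; the upper bound is silently dropped.
print(combined)
```

    The upper bound on act_date never reaches Spark, which is why rows after 2017-04-01 are included.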

    Whereas in the second case:

    df.filter("act_date <='2017-04-01'").filter("act_date >='2016-10-01'")\
      .select("col1","col2").distinct().count()
    

    the two filters are chained, so both conditions are applied and the result contains only the values between 2016-10-01 and 2017-04-01 (inclusive).
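
    If a single filter call is preferred, both bounds can go inside one SQL expression string, so that the AND is interpreted by Spark rather than by Python. A minimal sketch, assuming act_date compares correctly against these literals as in the question; date_range_condition is a hypothetical helper introduced here for illustration, not part of PySpark.

```python
def date_range_condition(column: str, start: str, end: str) -> str:
    """Build one SQL expression string covering both bounds (inclusive).

    Hypothetical helper for illustration only; it just formats a string.
    """
    return f"{column} >= '{start}' AND {column} <= '{end}'"


condition = date_range_condition("act_date", "2016-10-01", "2017-04-01")
print(condition)

# With the DataFrame from the question, the string would be passed as one argument:
# df.filter(condition).select("col1", "col2").distinct().count()
```

    Because the AND sits inside the string, it is evaluated per row by Spark and should select the same rows as the chained version.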
