Filtering rows based on column values in a Spark DataFrame (Scala)

时光说笑 2020-12-09 21:16

I have a Spark DataFrame:

id  value 
3     0
3     1
3     0
4     1
4     0
4     0

I want to create a new DataFrame:

3 0         


        
4 Answers
  •  天命终不由人
    2020-12-09 21:43

    You can simply use groupBy on both columns and then drop the count:

    val df2 = df1.groupBy("id","value").count().select("id","value")
    

    Here, df1 is your input DataFrame:

    id  value 
    3     0
    3     1
    3     0
    4     1
    4     0
    4     0
    

    The resulting DataFrame, df2, holds one row per distinct (id, value) pair, which should be your expected output (Spark does not guarantee the row order):

    id  value 
    3     0
    3     1
    4     1
    4     0
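
Grouping on every column and discarding the count amounts to removing duplicate rows. The following is a minimal sketch, not the answerer's code, that checks this against `distinct()`. It assumes Spark SQL is on the classpath and runs with a local master; the object name `DistinctPairs`, the app name, and the helper values `viaGroupBy` and `viaDistinct` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, used only to make the sketch runnable.
object DistinctPairs {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for this illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("distinct-pairs")
      .getOrCreate()
    import spark.implicits._

    // Same sample rows as df1 in the answer above.
    val df1 = Seq((3, 0), (3, 1), (3, 0), (4, 1), (4, 0), (4, 0))
      .toDF("id", "value")

    // The answer's approach: group on both columns, then drop the count.
    val viaGroupBy = df1.groupBy("id", "value").count().select("id", "value")

    // A more direct alternative: keep one row per distinct (id, value) pair.
    val viaDistinct = df1.distinct()

    // Compare as sets because the row order is not guaranteed.
    val a = viaGroupBy.as[(Int, Int)].collect().toSet
    val b = viaDistinct.as[(Int, Int)].collect().toSet
    assert(a == b)
    assert(a == Set((3, 0), (3, 1), (4, 0), (4, 1)))

    spark.stop()
  }
}
```

If only some columns should define a duplicate, `dropDuplicates` with a list of column names may be a closer fit than `distinct()`, which always compares whole rows.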
    
