How to select only the last row among the subset of rows satisfying a condition in R programming

被撕碎了的回忆 2021-01-28 17:57

The data frame looks like this, and I want to keep only the last row for each Customer_id:

Customer_id A B C D E F G
10000001    1 1 2 3 1 3 1
10000001    1 2 3 1 2 1 3
10000002    2 2 2 3 1 3 1
10000002    2 2 1 4 2 3 1         


        
3 answers
  • 2021-01-28 18:26

    Something like this (hard to code without the data in R format):

    dataframe[ rev(!duplicated(rev(dataframe$Customer_id))),]
    

    or better

    dataframe[ !duplicated(dataframe$Customer_id,fromLast=TRUE),]
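
As a minimal sketch of the `duplicated` approach, the four rows from the question can be typed in by hand and filtered. The name `dataframe` follows the answer above; `keep` and `last_rows` are names introduced here only for illustration.

```r
# Sample data copied from the question (only the four rows shown there).
dataframe <- data.frame(
  Customer_id = c(10000001, 10000001, 10000002, 10000002),
  A = c(1, 1, 2, 2),
  B = c(1, 2, 2, 2),
  C = c(2, 3, 2, 1),
  D = c(3, 1, 3, 4),
  E = c(1, 2, 1, 2),
  F = c(3, 1, 3, 3),
  G = c(1, 3, 1, 1)
)

# With fromLast = TRUE the scan starts at the end, so every occurrence of an
# id except its last one is flagged as a duplicate.
keep <- !duplicated(dataframe$Customer_id, fromLast = TRUE)

# keep is FALSE TRUE FALSE TRUE here, so rows 2 and 4 are retained.
last_rows <- dataframe[keep, ]
print(last_rows)
```

This keeps whole rows as they are, including any NA values, and preserves the original row order.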
    
  • 2021-01-28 18:37

    You can also use aggregate:

    aggregate(. ~ Customer_id, data = DF, FUN =  tail, 1)
    ##   Customer_id A B C D E F G
    ## 1    10000001 1 2 3 1 2 1 3
    ## 2    10000002 2 2 1 4 2 3 1
    ## 3    10000003 1 1 2 2 1 2 1
    ## 4    10000004 1 4 1 4 1 3 1
    ## 5    10000006 1 3 1 4 1 2 1
    ## 6    10000008 1 3 1 1 2 2 1
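
One point worth checking before relying on `aggregate` here: the formula interface applies `tail` to each column separately and, by default (`na.action = na.omit`), drops any row containing an NA before aggregating. The sketch below uses only the four rows from the question; `DF_na` and the result names are hypothetical, and the NA is injected purely to illustrate the behaviour.

```r
# Sample data copied from the question (only the four rows shown there).
DF <- data.frame(
  Customer_id = c(10000001, 10000001, 10000002, 10000002),
  A = c(1, 1, 2, 2),
  B = c(1, 2, 2, 2),
  C = c(2, 3, 2, 1),
  D = c(3, 1, 3, 4),
  E = c(1, 2, 1, 2),
  F = c(3, 1, 3, 3),
  G = c(1, 3, 1, 1)
)

# The trailing 1 is passed on to tail() as n = 1, giving the last value
# of every column within each Customer_id group.
last_values <- aggregate(. ~ Customer_id, data = DF, FUN = tail, 1)
print(last_values)

# Illustration only: put an NA into the last row of the second customer.
DF_na <- DF
DF_na$G[4] <- NA

# The row with the NA is removed before grouping, so the values reported for
# 10000002 now come from row 3 rather than row 4.
last_values_na <- aggregate(. ~ Customer_id, data = DF_na, FUN = tail, 1)
print(last_values_na)
```

If the data can contain missing values and the literal last row is wanted, the `duplicated(..., fromLast = TRUE)` approach avoids this difference.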
    
  • 2021-01-28 18:38

    Assuming your data frame is named dat, here's one way using by and rbind, although the other two methods (aggregate and duplicated) are much nicer:

    do.call(rbind, by(dat, dat$Customer_id, FUN = tail, 1))
    ##    Customer_id A B C D E F G
    ## 2     10000001 1 2 3 1 2 1 3
    ## 4     10000002 2 2 1 4 2 3 1
    ## 7     10000003 1 1 2 2 1 2 1
    ## 11    10000004 1 4 1 4 1 3 1
    ## 13    10000006 1 3 1 4 1 2 1
    ## 16    10000008 1 3 1 1 2 2 1
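
To see that the `by` route and the `duplicated` route agree, the following sketch runs both on the four rows from the question and compares the values. The names `by_last` and `dup_last` are introduced here for illustration; row names are reset before comparing because the two routes may label rows differently.

```r
# Sample data copied from the question (only the four rows shown there).
dat <- data.frame(
  Customer_id = c(10000001, 10000001, 10000002, 10000002),
  A = c(1, 1, 2, 2),
  B = c(1, 2, 2, 2),
  C = c(2, 3, 2, 1),
  D = c(3, 1, 3, 4),
  E = c(1, 2, 1, 2),
  F = c(3, 1, 3, 3),
  G = c(1, 3, 1, 1)
)

# by() splits dat by Customer_id and applies tail(piece, 1) to each piece;
# do.call(rbind, ...) stacks the resulting one-row data frames.
by_last <- do.call(rbind, by(dat, dat$Customer_id, FUN = tail, 1))

# The same selection using duplicated(), for comparison.
dup_last <- dat[!duplicated(dat$Customer_id, fromLast = TRUE), ]

# Compare values only, ignoring row labels.
rownames(by_last) <- NULL
rownames(dup_last) <- NULL
print(all(as.matrix(by_last) == as.matrix(dup_last)))
```

Note that `by` returns groups in sorted order of `Customer_id`, whereas the `duplicated` subset keeps the original row order; the two coincide here only because the sample is already sorted by id.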
    