How to select rows according to column value conditions

心在旅途 2020-12-06 22:10

I have a data set which looks like the following (partially):

id  name    dummy
1   Jane    1
1   Jane    0
1   Jane    1
2   Mike    0
2   Mike    0
2   Mik         


        
4 Answers
  • 2020-12-06 22:42

    ave can be used to compute the group-wise sum while keeping one value per original row, in the original order:

    x[with(x, ave(dummy, name, FUN=sum))>0,]
    ##    id name dummy
    ## 1   1 Jane     1
    ## 2   1 Jane     0
    ## 3   1 Jane     1
    ## 9   3  Tom     1
    ## 10  3  Tom     1
    ## 11  3  Tom     0
    ## 12  3  Tom     0
    

    ave is similar to aggregate, but instead of returning one value per group it repeats the group's aggregated value for every row in that group:

    with(x, ave(dummy, name, FUN=sum))
    ## [1] 2 2 2 0 0 0 0 0 2 2 2 2
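
    To make the contrast with aggregate concrete, here is a minimal sketch. The data frame is an assumption: the question's table is truncated, so the values below are reconstructed from the output printed in this answer and may not match the asker's real data.

    ```r
    # Hypothetical sample data, reconstructed from the printed output above
    # (the question's own table is truncated, so these values are an assumption).
    x <- data.frame(
      id    = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
      name  = c("Jane", "Jane", "Jane", "Mike", "Mike", "Mike", "Mike", "Mike",
                "Tom", "Tom", "Tom", "Tom"),
      dummy = c(1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0)
    )

    # aggregate() collapses the data to one row per group.
    per_group <- aggregate(dummy ~ name, data = x, FUN = sum)

    # ave() returns one value per original row, so the result can be used
    # directly as a row filter on x.
    per_row <- with(x, ave(dummy, name, FUN = sum))

    # Keep every row that belongs to a group with at least one non-zero dummy.
    kept <- x[per_row > 0, ]
    ```

    Because per_row has the same length as x, no merge back onto the original data is needed before subsetting.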
    
  • 2020-12-06 22:53

    You can also use plyr::ddply:

    library(plyr)
    ddply(df, .(name), function(x) subset(x, !all(dummy == 0)))
    ##   id name dummy
    ## 1  1 Jane     1
    ## 2  1 Jane     0
    ## 3  1 Jane     1
    ## 4  3  Tom     1
    ## 5  3  Tom     1
    ## 6  3  Tom     0
    ## 7  3  Tom     0
    

    Note that !all(dummy == 0) can be replaced with the equivalent any(dummy != 0).
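
    The equivalence of the two conditions can be checked with a small base R sketch. The group values below are hypothetical and chosen only for illustration; they are not taken from the question.

    ```r
    # Hypothetical group values chosen for illustration (not from the question).
    groups <- list(
      all_zero = c(0, 0, 0),
      mixed    = c(1, 0, 1),
      all_one  = c(1, 1)
    )

    # "Not all values are zero" ...
    not_all_zero <- sapply(groups, function(dummy) !all(dummy == 0))

    # ... is the same condition as "at least one value is non-zero".
    any_nonzero <- sapply(groups, function(dummy) any(dummy != 0))

    # Both formulations flag the same groups.
    same <- identical(not_all_zero, any_nonzero)
    ```

    Which form to use is a matter of readability; any(dummy != 0) states the intent ("keep groups with at least one non-zero value") more directly.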

  • 2020-12-06 23:01

    Assuming df is your data.frame, use tapply and [ to subset the rows you want:

    > ind <- with(df, tapply(dummy, name, sum))
    > df[df$name %in% names(ind)[ind!=0], ]
       id name dummy
    1   1 Jane     1
    2   1 Jane     0
    3   1 Jane     1
    9   3  Tom     1
    10  3  Tom     1
    11  3  Tom     0
    12  3  Tom     0
    

    Another alternative:

    > result <- split(df, df$name)[with(df, tapply(dummy, name, function(x) sum(x)!=0))]
    > do.call(rbind, result)
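
    One detail of the split approach is worth illustrating: rbind on a named list builds row names from the list names, so the combined result no longer carries the original row numbers. A minimal sketch, assuming the same hypothetical sample data as the printed output in this answer (the question's real data is truncated):

    ```r
    # Hypothetical sample data, reconstructed from the printed output
    # (an assumption; the question's table is truncated).
    df <- data.frame(
      id    = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
      name  = c("Jane", "Jane", "Jane", "Mike", "Mike", "Mike", "Mike", "Mike",
                "Tom", "Tom", "Tom", "Tom"),
      dummy = c(1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0)
    )

    # One TRUE/FALSE per group: does the group contain a non-zero dummy?
    keep <- with(df, tapply(dummy, name, function(x) sum(x) != 0))

    # Split into one data frame per name and keep only the flagged groups.
    result <- split(df, df$name)[keep]

    # Stack the remaining groups back into a single data frame.
    combined <- do.call(rbind, result)

    # rbind() derives row names from the list names; reset them to 1..n
    # if plain sequential row names are preferred.
    rownames(combined) <- NULL
    ```

    The logical index lines up with the list because split and tapply both order the groups by the levels of name.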
    
  • 2020-12-06 23:03

    A possible solution:

    subset(dat, as.logical(ave(dummy, id, FUN = any)))
    
    #    id name dummy
    # 1   1 Jane     1
    # 2   1 Jane     0
    # 3   1 Jane     1
    # 9   3  Tom     1
    # 10  3  Tom     1
    # 11  3  Tom     0
    # 12  3  Tom     0
    

    An alternative with data.table:

    library(data.table)
    setDT(dat)[, if (any(dummy)) .SD, by = id]
    

    Or with dplyr:

    library(dplyr)
    dat %>% 
      group_by(id) %>% 
      filter(any(dummy))
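
    All three variants pass dummy straight to any, which relies on R coercing the numeric column to logical. If dummy is stored as a double rather than an integer, any may emit a coercion warning. A hedged base R sketch that avoids the coercion by comparing explicitly; the data frame is hypothetical, reconstructed from the output printed in this answer:

    ```r
    # Hypothetical sample data with dummy stored as double
    # (an assumption; the question's table is truncated).
    dat <- data.frame(
      id    = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
      name  = c("Jane", "Jane", "Jane", "Mike", "Mike", "Mike", "Mike", "Mike",
                "Tom", "Tom", "Tom", "Tom"),
      dummy = c(1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0)
    )

    # Compare first, so any() receives a logical vector and nothing is coerced.
    keep <- with(dat, ave(dummy == 1, id, FUN = any))

    # keep is already logical, so it can index the rows directly.
    kept <- dat[keep, ]
    ```

    The same explicit comparison, any(dummy == 1), can be dropped into the data.table and dplyr versions above.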
    