dplyr filter with condition on multiple columns

粉色の甜心 2020-12-02 23:56

Here's some dummy data:

father   <- c(1, 1, 1, 1, 1)
mother   <- c(1, 1, 1, NA, NA)
children <- c(NA, NA, 2, 5, 2)
cousins  <- c(NA, 5, 1, 1, 4)


d         


        
4 answers
  •  爱一瞬间的悲伤
    2020-12-03 00:30

    A possible dplyr solution (for 0.5.0.9004 <= version < 1.0) is:

    # > packageVersion('dplyr')
    # [1] ‘0.5.0.9004’
    
    dataset %>%
        filter(!is.na(father), !is.na(mother)) %>%
        filter_at(vars(-father, -mother), all_vars(is.na(.)))
    

    Explanation:

    • vars(-father, -mother): select all columns except father and mother.
    • all_vars(is.na(.)): keep rows where is.na is TRUE for all the selected columns.

    Note: use any_vars instead of all_vars to keep rows where is.na is TRUE for at least one of the selected columns.
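
    To make the difference between all_vars and any_vars concrete, here is a minimal sketch run on the question's dummy data. It assumes the four vectors are combined into a data frame named dataset (that step is not shown in the post) and that a dplyr version which still provides filter_at is installed. The names parents_known, all_missing and any_missing are made up for illustration.

```r
library(dplyr)

# Assumption: the question's vectors are combined into a data frame called
# `dataset`; the original post does not show this step.
dataset <- data.frame(
  father   = c(1, 1, 1, 1, 1),
  mother   = c(1, 1, 1, NA, NA),
  children = c(NA, NA, 2, 5, 2),
  cousins  = c(NA, 5, 1, 1, 4)
)

# Keep rows where both parents are present (rows 1 to 3 of the dummy data).
parents_known <- dataset %>%
  filter(!is.na(father), !is.na(mother))

# all_vars(): every remaining column (children, cousins) must be NA.
all_missing <- parents_known %>%
  filter_at(vars(-father, -mother), all_vars(is.na(.)))

# any_vars(): at least one remaining column must be NA.
any_missing <- parents_known %>%
  filter_at(vars(-father, -mother), any_vars(is.na(.)))
```

    With this data, all_missing should hold only the first row (children and cousins both NA), while any_missing should also include the second row, where only children is NA.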


    Update (2020-11-28)

    The _at functions and vars have been superseded by across as of dplyr 1.0, so the following (or something similar) is now recommended:

    dataset %>%
        filter(across(c(father, mother), ~ !is.na(.x))) %>%
        filter(across(c(-father, -mother), is.na))
    

    See more examples of across, and how to rewrite older code with the new approach, here: Column-wise operations, or type vignette("colwise") in R after installing the latest version of dplyr.
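
    A note for readers on newer releases: as far as I know, dplyr 1.0.4 added if_all() and if_any() for this kind of row filtering, and calling across() inside filter() was later deprecated in their favour. A minimal sketch of the same filter in that style is below. It assumes dplyr >= 1.0.4 and, again, that the question's vectors are combined into a data frame named dataset. The names strict and loose are illustrative only.

```r
library(dplyr)

# Assumption: same dummy data as in the question, combined into `dataset`.
dataset <- data.frame(
  father   = c(1, 1, 1, 1, 1),
  mother   = c(1, 1, 1, NA, NA),
  children = c(NA, NA, 2, 5, 2),
  cousins  = c(NA, 5, 1, 1, 4)
)

# if_all(): the predicate must hold for every selected column.
strict <- dataset %>%
  filter(if_all(c(father, mother), ~ !is.na(.x))) %>%
  filter(if_all(-c(father, mother), is.na))

# if_any(): the predicate must hold for at least one selected column.
loose <- dataset %>%
  filter(if_all(c(father, mother), ~ !is.na(.x))) %>%
  filter(if_any(-c(father, mother), is.na))
```

    Here if_all() plays the role of all_vars() and if_any() the role of any_vars() from the older answer, so strict should contain one row and loose two.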
