How to preserve base data frame rownames upon filtering in dplyr chain

梦如初夏 2020-12-05 02:22

I have the following data frame:


df <- structure(list(BoneMarrow = c(30, 0, 0, 31138, 2703), Pulmonar         


        
4 Answers
  • 2020-12-05 02:41

    How about using base R logical indexing:

    df[rowSums(df>8)==dim(df)[2],] 
    
           BoneMarrow Pulmonary
    ATP1B1         30      3380
    PRR11        2703        27
    

    EDIT1: Alternatively, df[!rowSums(df<8),] (as suggested by @user20650) gives the same result here. The two differ only for values exactly equal to 8.
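
    As a minimal sketch of why this keeps the row names: base `[` subsetting with a logical vector leaves them alone. The original df is cut off above, so the data frame below is a stand-in. The BoneMarrow values and the ATP1B1/PRR11 rows come from the post, while the geneB/geneC/geneD names and their Pulmonary values are made up for illustration.

    ```r
    # Stand-in data: middle row names and Pulmonary values are hypothetical
    df <- data.frame(
      BoneMarrow = c(30, 0, 0, 31138, 2703),
      Pulmonary  = c(3380, 5, 0, 0, 27),
      row.names  = c("ATP1B1", "geneB", "geneC", "geneD", "PRR11")
    )

    # TRUE for rows where every column is greater than 8
    keep <- rowSums(df > 8) == ncol(df)
    res  <- df[keep, , drop = FALSE]

    # The EDIT1 variant: keep rows with no value below 8
    alt <- df[!rowSums(df < 8), , drop = FALSE]

    rownames(res)
    rownames(alt)
    ```

    The `drop = FALSE` is only a precaution so that a single-column data frame would not collapse to a vector.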

  • 2020-12-05 02:42

    You can convert the row names to a column and convert them back after filtering:

    library(dplyr)
    library(tibble)  # for `rownames_to_column` and `column_to_rownames`
    
    df %>%
        rownames_to_column('gene') %>%
        filter_if(is.numeric, all_vars(. >= 8)) %>%
        column_to_rownames('gene')
    
    #        BoneMarrow Pulmonary
    # ATP1B1         30      3380
    # PRR11        2703        27
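
    In dplyr 1.0.4 and later, scoped verbs such as filter_if() are superseded, and the same filter can probably be written with if_all(). A sketch of that variant, assuming a recent dplyr and using a stand-in df (the original is cut off in the question, so the geneB/geneC/geneD names and their Pulmonary values are hypothetical):

    ```r
    library(dplyr)
    library(tibble)

    # Stand-in data: middle row names and Pulmonary values are hypothetical
    df <- data.frame(
      BoneMarrow = c(30, 0, 0, 31138, 2703),
      Pulmonary  = c(3380, 5, 0, 0, 27),
      row.names  = c("ATP1B1", "geneB", "geneC", "geneD", "PRR11")
    )

    res <- df %>%
      rownames_to_column("gene") %>%
      # keep rows where every numeric column is at least 8
      filter(if_all(where(is.numeric), ~ .x >= 8)) %>%
      column_to_rownames("gene")

    rownames(res)
    ```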
    
  • 2020-12-05 02:47

    For gene counts, you often want to keep genes where at least x samples have more than y counts, rather than requiring the threshold in every sample.

    This is not as pretty as filter_if, but I'm not sure how you would express the same rowSums condition using all_vars.

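    library(dplyr)
    library(tibble)
    
    x <- sample_threshold  # minimum number of samples (set to your own value)
    y <- count_threshold   # count that each of those samples must exceed
    
    df %>%
        tibble::rownames_to_column('gene') %>%
        # keep genes with more than y counts in at least x samples
        dplyr::filter(rowSums(dplyr::select(., -gene) > y) >= x) %>%
        tibble::column_to_rownames('gene')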
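
    A worked example may make the thresholds more concrete. This is only a sketch: the values x = 1 and y = 100 are made up, "at least x" is read as `>= x`, and the df is a stand-in for the truncated one in the question (the geneB/geneC/geneD names and their Pulmonary values are hypothetical).

    ```r
    library(dplyr)
    library(tibble)

    # Stand-in data: middle row names and Pulmonary values are hypothetical
    df <- data.frame(
      BoneMarrow = c(30, 0, 0, 31138, 2703),
      Pulmonary  = c(3380, 5, 0, 0, 27),
      row.names  = c("ATP1B1", "geneB", "geneC", "geneD", "PRR11")
    )

    x <- 1    # hypothetical: minimum number of samples
    y <- 100  # hypothetical: count each of those samples must exceed

    res <- df %>%
      rownames_to_column("gene") %>%
      # count, per gene, the samples above y and compare that count to x
      filter(rowSums(select(., -gene) > y) >= x) %>%
      column_to_rownames("gene")

    rownames(res)
    ```

    With these values a gene only needs one sample above 100, so geneD is kept even though its Pulmonary count is 0.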
    
  • 2020-12-05 02:55

    Here is another base R method, using Reduce:

    df[Reduce(`&`, lapply(df, `>=`, 8)),]
    #       BoneMarrow Pulmonary
    #ATP1B1         30      3380
    #PRR11        2703        27
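
    To unpack the one-liner, here is a sketch that splits it into steps. The df is a stand-in for the truncated one in the question, so the geneB/geneC/geneD names and their Pulmonary values are hypothetical.

    ```r
    # Stand-in data: middle row names and Pulmonary values are hypothetical
    df <- data.frame(
      BoneMarrow = c(30, 0, 0, 31138, 2703),
      Pulmonary  = c(3380, 5, 0, 0, 27),
      row.names  = c("ATP1B1", "geneB", "geneC", "geneD", "PRR11")
    )

    # One logical vector per column: is each value at least 8?
    per_column <- lapply(df, `>=`, 8)

    # Element-wise AND across all columns: TRUE only if every column passes
    keep <- Reduce(`&`, per_column)

    # Logical row indexing keeps the original row names
    res <- df[keep, ]
    rownames(res)
    ```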
    