Merging rows with shared information

故里飘歌 2021-01-07 04:18

I have a data.frame with several rows that came from a merge but were not completely merged:

b <- read.table(text = "
      ID   Age    Steatosis               


        
4 Answers
  •  灰色年华
    2021-01-07 04:27

    A dplyr approach using summarise_all:

    ## using `na.strings` to identify NA entries in posted data
    b <- read.table(text = "
          ID   Age    Steatosis       Mallory Lille_dico Lille_3 Bili.AHHS2cat
    68 HA-09   16         <NA>          <NA>       <NA>       5             NA
    69 HA-09   16         <33% no/occasional       <NA>      NA             1", na.strings = c("NA", "<NA>"))
    
    library(dplyr)
    f <- function(x) {
      x <- na.omit(x)
      if (length(x) > 0) first(x) else NA
    }
    ## pass f directly; funs() is deprecated since dplyr 0.8.0
    res <- b %>% group_by(ID, Age) %>% summarise_all(f)
    ##Source: local data frame [1 x 7]
    ##Groups: ID [?]
    ##
    ##      ID   Age Steatosis       Mallory Lille_dico Lille_3 Bili.AHHS2cat
    ##                                 
    ##1  HA-09    16      <33% no/occasional         NA       5             1
    

    The function f is defined this way to handle the case where all values in a group are NA: na.omit then leaves nothing to pick from, so f returns NA.
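
    As a minimal sketch of this all-NA handling, the following uses base R only. The name first_non_na and the sample vectors are made up for illustration, and x[1] stands in for dplyr's first().

    ```r
    # Hypothetical base-R variant of the helper: drop NAs, keep the first value left.
    first_non_na <- function(x) {
      x <- na.omit(x)                    # remove NA entries
      if (length(x) > 0) x[1] else NA    # nothing left means the group was all NA
    }

    one_value  <- first_non_na(c(NA, 5, NA))  # a single non-NA value is returned as is
    all_na     <- first_non_na(c(NA, NA))     # an all-NA input falls through to NA
    two_values <- first_non_na(c(NA, 5, 7))   # with several values, only the first is kept
    ```

    Without the length check, taking the first element of an empty vector would not give a clean length-one NA for every column type, which is what summarising one row per group needs.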


    As @jdobres suggests, if a column can have more than one non-NA value per group and you want to keep all of them, you may want to flatten them into a single string using:

    library(dplyr)
    f <- function(x) {
      x <- na.omit(x)
      if (length(x) > 0) paste(x,collapse='-') else NA
    }
    res <- b %>% group_by(ID, Age) %>% summarise_all(f)
    

    In your posted data, the result would be the same as above because every summarized column has at most one non-NA value per group.
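
    To see where the two helpers would differ, here is a hedged sketch on made-up data in which one group has two non-NA values in a column. The data frame toy and the name collapse_non_na are hypothetical, and summarise(across(...)), available in dplyr 1.0.0 and later, is used here in place of summarise_all.

    ```r
    library(dplyr)

    # Hypothetical toy data: group "A" has two non-NA values in `grade`.
    toy <- data.frame(
      ID    = c("A", "A", "B"),
      grade = c("low", "high", NA),
      score = c(NA, 3, NA)
    )

    # Same idea as the paste-based helper: join all non-NA values with "-".
    collapse_non_na <- function(x) {
      x <- na.omit(x)
      if (length(x) > 0) paste(x, collapse = "-") else NA
    }

    # One row per ID; every non-grouping column is collapsed by the helper.
    res_toy <- toy %>%
      group_by(ID) %>%
      summarise(across(everything(), collapse_non_na), .groups = "drop")
    ```

    In this sketch, grade for group "A" becomes "low-high" and group "B" stays NA. Note that paste() returns character, so a numeric column such as score comes back as a string ("3") and may need converting afterwards.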
