I have a data.frame in which several rows come from a merge and were not completely merged:
b <- read.table(text = "
ID Age Steatosis
A dplyr approach using summarise_all:
## using `na.strings` to identify NA entries in posted data
b <- read.table(text = "
ID Age Steatosis Mallory Lille_dico Lille_3 Bili.AHHS2cat
68 HA-09 16 <NA> <NA> <NA> 5 NA
69 HA-09 16 <33% no/occasional <NA> NA 1", na.strings = c("NA", "<NA>"))
library(dplyr)
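f <- function(x) {
  ## drop missing values, then keep the first remaining one (if any)
  x <- na.omit(x)
  if (length(x) > 0) first(x) else NA
}
## `funs()` is deprecated in recent dplyr versions; pass the function directly
res <- b %>% group_by(ID, Age) %>% summarise_all(f)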
##Source: local data frame [1 x 7]
##Groups: ID [?]
##
## ID Age Steatosis Mallory Lille_dico Lille_3 Bili.AHHS2cat
##
##1 HA-09 16 <33% no/occasional NA 5 1
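The function is defined this way to handle the case where all values in a column are NA for a group: after `na.omit` nothing is left, so it returns NA instead of trying to take the first element of an empty vector.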
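To see the effect of the length check in isolation, here is a minimal base-R sketch. The name `first_non_na` and the two input vectors are made up for illustration, and it uses `x[!is.na(x)]` and `x[1]` in place of `na.omit` and `first`, so it is a variant of the function above rather than the same code.

```r
# Hypothetical base-R variant of the helper: drop NAs, keep the first value left.
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA
}

all_missing <- c(NA, NA)  # made-up values: a column that is entirely NA in a group
one_value   <- c(NA, 5)   # made-up values: a column with exactly one real value

first_non_na(all_missing)  # NA
first_non_na(one_value)    # 5
```

Without the `else NA` branch the all-NA case would have nothing sensible to return for that group.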
As @jdobres suggests, if a column has more than one non-NA value per group and you want to keep all of them, you can collapse them into a single string instead:
library(dplyr)
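f <- function(x) {
  ## drop missing values, then join whatever is left with "-"
  x <- na.omit(x)
  if (length(x) > 0) paste(x, collapse = '-') else NA
}
## `funs()` is deprecated in recent dplyr versions; pass the function directly
res <- b %>% group_by(ID, Age) %>% summarise_all(f)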
In your posted data, the result would be the same as above because every summarized column has at most one non-NA value per group.
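The difference only shows up when a group has several non-NA values in the same column. The sketch below uses a small made-up data frame `d` (not your data) and two hypothetical helpers, `keep_first` and `paste_all`, to compare the two strategies. It assumes dplyr 1.0.0 or later, where `across()` is available as the replacement for `summarise_all`.

```r
library(dplyr)

# Made-up data: group "a" has two non-NA values in v, group "b" has none.
d <- data.frame(ID = c("a", "a", "b"),
                v  = c("x", "y", NA),
                stringsAsFactors = FALSE)

# Hypothetical helper: keep only the first non-NA value.
keep_first <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA_character_
}

# Hypothetical helper: collapse all non-NA values into one string.
paste_all <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) paste(x, collapse = "-") else NA_character_
}

res_first <- d %>% group_by(ID) %>% summarise(across(everything(), keep_first))
res_paste <- d %>% group_by(ID) %>% summarise(across(everything(), paste_all))

res_first$v  # "x" NA
res_paste$v  # "x-y" NA
```

Taking the first value silently drops "y", while pasting keeps both values but turns the column into character, so which one is appropriate depends on what the duplicated rows mean in your data.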