join matching columns in a data.frame or data.table

难免孤独 2020-12-16 03:14

I have the following data.frames:

a <- data.frame(id = 1:3, v1 = c('a', NA, NA), v2 = c(NA, 'b', 'c'))
b <- data.frame(id = 1:3, v1 = c(NA, 'B


        
2 Answers
  • 2020-12-16 03:50

    The type of merge you specify probably won't be possible using merge (with data frames), although saying that usually invites being proved wrong.

    You also omit some details: will there always be a single unique non-NA value in each column for each id value? If so, this will work:

    > library(plyr)
    > ab <- rbind(a, b)
    > colFun <- function(x){x[which(!is.na(x))]}
    > ddply(ab, .(id), function(x){colwise(colFun)(x)})
      id v1 v2
    1  1  a  A
    2  2  B  b
    3  3  C  c
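
    If you'd rather not load plyr, the same stack-then-collapse idea can be sketched in base R with aggregate. This is only a sketch under the assumption stated above (at most one non-NA value per column per id); firstNonNA is a hypothetical helper name, not something from plyr or base.

    ```r
    a <- data.frame(id = 1:3, v1 = c("a", NA, NA), v2 = c(NA, "b", "c"),
                    stringsAsFactors = FALSE)
    b <- data.frame(id = 1:3, v1 = c(NA, "B", "C"), v2 = c("A", NA, NA),
                    stringsAsFactors = FALSE)

    # stack both frames so every id appears once per source
    ab <- rbind(a, b)

    # hypothetical helper: first non-NA entry of x, or NA if there is none
    firstNonNA <- function(x) x[!is.na(x)][1]

    # collapse each value column to a single entry per id
    res <- aggregate(ab[c("v1", "v2")], by = ab["id"], FUN = firstNonNA)
    ```

    Taking the first non-NA entry keeps the result at one row per id even if an id has no value at all in some column; that cell simply stays NA.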
    

    A similar strategy should work with data.tables as well:

    > library(data.table)
    > abDT <- data.table(ab, key = "id")
    > abDT[, list(colFun(v1), colFun(v2)), by = id]
         id V1 V2
    [1,]  1  a  A
    [2,]  2  B  b
    [3,]  3  C  c
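
    With more than a couple of value columns, listing each one by hand gets tedious. A possible variant, assuming the same one-value-per-id structure, applies the function to every non-grouping column through .SD. firstNonNA is a hypothetical helper name introduced here for illustration.

    ```r
    library(data.table)

    a <- data.frame(id = 1:3, v1 = c("a", NA, NA), v2 = c(NA, "b", "c"),
                    stringsAsFactors = FALSE)
    b <- data.frame(id = 1:3, v1 = c(NA, "B", "C"), v2 = c("A", NA, NA),
                    stringsAsFactors = FALSE)

    # stack the two frames into one data.table
    abDT <- rbindlist(list(a, b))

    # hypothetical helper: first non-NA entry of x, or NA if there is none
    firstNonNA <- function(x) x[!is.na(x)][1]

    # .SD holds all columns except the grouping column id
    res <- abDT[, lapply(.SD, firstNonNA), by = id]
    ```

    This also keeps the original column names (v1, v2) instead of the auto-generated V1, V2.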
    
  • 2020-12-16 03:52

    If your data is as simple as it is above, joran's answer is likely the simplest way. Here's my approach in base R:

    a <- data.frame(id = 1:3, v1 = c('a', NA, NA), v2 = c(NA, 'b', 'c'))
    b <- data.frame(id = 1:3, v1 = c(NA, 'B', 'C'), v2 = c("A", NA, NA))
    
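    # keep x unless it is NA, in which case fall back to y
    decider <- function(x, y) factor(ifelse(is.na(x), as.character(y), as.character(x)))
    # apply decider to the columns of a and b pairwise (rows are matched by position)
    data.frame(mapply(a, b, FUN = decider))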
    

    If your data has different ids (some overlap and some do not), then here's a different approach:

    a <- data.frame(id = c(1,2,4,5), v1 = c('a', NA, "q", NA), v2 = c(NA, 'b', 'c', "e"))
    b <- data.frame(id = 1:4, v1 = c(NA, "A", "C", 'B'), v2 = c("A", NA, "D", NA))
    
    decider <- function(x, y) factor(ifelse(is.na(x), as.character(y), as.character(x)))
    
    DF <- data.frame(mapply(a, b, FUN = decider))
    DF2 <- rbind(b[!b$id %in% DF$id , ], DF)
    DF2 <- DF2[order(DF2$id), ]
    rownames(DF2) <- 1:nrow(DF2)
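
    Note that mapply pairs the rows of a and b by position, so it relies on both frames listing the same ids in the same order. When that isn't guaranteed, one option is to align on id first with a full outer merge and then fill the gaps column by column. A minimal sketch, assuming a's value wins when both frames have one; coalesce2 is a hypothetical helper name.

    ```r
    a <- data.frame(id = c(1, 2, 4, 5), v1 = c("a", NA, "q", NA),
                    v2 = c(NA, "b", "c", "e"), stringsAsFactors = FALSE)
    b <- data.frame(id = 1:4, v1 = c(NA, "A", "C", "B"),
                    v2 = c("A", NA, "D", NA), stringsAsFactors = FALSE)

    # full outer join on id; the shared column names get suffixes .a and .b
    m <- merge(a, b, by = "id", all = TRUE, suffixes = c(".a", ".b"))
    m <- m[order(m$id), ]

    # hypothetical helper: take x where present, otherwise fall back to y
    coalesce2 <- function(x, y) ifelse(is.na(x), y, x)

    res <- data.frame(id = m$id,
                      v1 = coalesce2(m$v1.a, m$v1.b),
                      v2 = coalesce2(m$v2.a, m$v2.b),
                      stringsAsFactors = FALSE)
    ```

    Every id from either frame ends up in res exactly once. A cell stays NA only when neither frame supplies a value for it, as with v1 for id 5 here.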
    