Merge dataframes on matching A, B and *closest* C?

走了就别回头了 2020-12-05 11:14

I have two dataframes like so:

set.seed(1)
df <- cbind(expand.grid(x=1:3, y=1:5), time=round(runif(15)*30))
to.merge <- data.frame(x=c(2, 2, 2, 3, 2),
         


        
3 Answers
  •  [愿得一人]
    2020-12-05 11:57

    Here is one way to do it, using merge a couple of times and aggregate once.

    set.seed(1)
    df <- cbind(expand.grid(x = 1:3, y = 1:5), time = round(runif(15) * 30))
    to.merge <- data.frame(x = c(2, 2, 2, 3, 2), y = c(1, 1, 1, 5, 4),
                           time = c(17, 12, 11.6, 22.5, 2), val = letters[1:5],
                           stringsAsFactors = FALSE)
    
    #Find rows that match by x and y
    res <- merge(to.merge, df, by = c("x", "y"), all.x = TRUE)
    res$dif <- abs(res$time.x - res$time.y)
    res
    ##   x y time.x val time.y dif
    ## 1 2 1   17.0   a     11 6.0
    ## 2 2 1   12.0   b     11 1.0
    ## 3 2 1   11.6   c     11 0.6
    ## 4 2 4    2.0   e      6 4.0
    ## 5 3 5   22.5   d     23 0.5
    
    #For each (x, y), keep the candidate row with the smallest time difference
    res1 <- merge(aggregate(dif ~ x + y, data = res, FUN = min), res)
    res1
    ##   x y dif time.x val time.y
    ## 1 2 1 0.6   11.6   c     11
    ## 2 2 4 4.0    2.0   e      6
    ## 3 3 5 0.5   22.5   d     23
    
    #Finally, merge the matches whose time difference is at most 1 back into df
    final <- merge(df, res1[res1$dif <= 1, c("x", "y", "val")], all.x = TRUE)
    final
    ##    x y time  val
    ## 1  1 1    8 <NA>
    ## 2  1 2   27 <NA>
    ## 3  1 3   28 <NA>
    ## 4  1 4    2 <NA>
    ## 5  1 5   21 <NA>
    ## 6  2 1   11    c
    ## 7  2 2    6 <NA>
    ## 8  2 3   20 <NA>
    ## 9  2 4    6 <NA>
    ## 10 2 5   12 <NA>
    ## 11 3 1   17 <NA>
    ## 12 3 2   27 <NA>
    ## 13 3 3   19 <NA>
    ## 14 3 4    5 <NA>
    ## 15 3 5   23    d
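
The three steps above could be folded into one helper so the tolerance is a parameter instead of a hard-coded 1. This is only a sketch: the name merge_closest, its arguments, and the .lookup/.base suffixes are hypothetical and not part of the original answer. It assumes both data frames have columns x, y and time, that the lookup table carries a val column, and that the base table has at most one row per (x, y).

```r
# Hypothetical helper (not from the original answer): attach `val` from
# `lookup` to `base` where x and y match exactly and `time` is closest,
# accepting a match only if the time gap is at most `tol`.
merge_closest <- function(base, lookup, tol = 1) {
  # Pair every lookup row with the base row that shares its x and y.
  res <- merge(lookup, base, by = c("x", "y"),
               suffixes = c(".lookup", ".base"))
  res$dif <- abs(res$time.lookup - res$time.base)

  # Per (x, y), keep the candidate(s) with the smallest time gap.
  best <- merge(aggregate(dif ~ x + y, data = res, FUN = min), res)

  # Left-join the accepted matches; unmatched base rows get NA in `val`.
  merge(base, best[best$dif <= tol, c("x", "y", "val")], all.x = TRUE)
}

# Same example data as in the answer above.
set.seed(1)
df <- cbind(expand.grid(x = 1:3, y = 1:5), time = round(runif(15) * 30))
to.merge <- data.frame(x = c(2, 2, 2, 3, 2), y = c(1, 1, 1, 5, 4),
                       time = c(17, 12, 11.6, 22.5, 2), val = letters[1:5],
                       stringsAsFactors = FALSE)

final <- merge_closest(df, to.merge, tol = 1)
final[!is.na(final$val), ]  # show only the rows that received a match
```

Two caveats with this sketch: if two lookup rows tie for the smallest gap within the same (x, y), both are kept and the matching base row appears twice in the result, and aggregate stops with an error when no lookup row shares an (x, y) pair with the base table.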
    
