What's the fastest way to merge/join data.frames in R?

走了就别回头了 2020-11-27 08:58

For example (not sure if this is the most representative example, though):

N <- 1e6
d1 <- data.frame(x=sample(N,N), y1=rnorm(N))
d2 <- data.frame(x=sample(N,N),          


        
5 Answers
  •  孤街浪徒
    2020-11-27 09:42

    For a simple task (unique values on both sides of the join) I use match:

    system.time({
        d <- d1
        d$y2 <- d2$y2[match(d1$x,d2$x)]
    })
    

    It's far faster than merge (on my machine, 0.13s versus 3.37s).
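
To see why match can stand in for a join when the keys are unique, here is a minimal sketch on toy data. The values are made up for illustration and are not part of the original benchmark; `idx` is just a helper name introduced here.

```r
# Toy data, made up for illustration: x is unique in both frames.
d1 <- data.frame(x = c(3, 1, 2), y1 = c("c", "a", "b"))
d2 <- data.frame(x = c(2, 3, 1), y2 = c(20, 30, 10))

# match() gives, for each key in d1$x, the row position of that key in d2$x.
idx <- match(d1$x, d2$x)

# Indexing d2$y2 by those positions lines the values up with d1's rows.
d <- d1
d$y2 <- d2$y2[idx]
print(d)
```

Keys in d1 that are missing from d2 come back as NA, so this behaves like a left join. If a key is duplicated in d2, match only returns the first position, which is why the approach is limited to unique keys.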

    My timings:

    • merge: 3.32s
    • plyr: 0.84s
    • match: 0.12s
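
A rough sketch for reproducing the merge and match timings and checking that both give the same result. Several things here are assumptions: the question's definition of d2 is cut off, so `y2 = rnorm(N)` is a guess based on the `d2$y2` used in the answer; N is reduced to keep the run short; the seed is added only for repeatability. The plyr timing is left out to avoid a package dependency.

```r
set.seed(1)   # assumption: seed added only so the run is repeatable
N <- 1e5      # assumption: smaller than the question's 1e6
d1 <- data.frame(x = sample(N, N), y1 = rnorm(N))
# Assumption: y2 = rnorm(N); the question's definition of d2 is truncated.
d2 <- data.frame(x = sample(N, N), y2 = rnorm(N))

# Time the base merge() join on the shared key column x.
t_merge <- system.time(m <- merge(d1, d2, by = "x"))

# Time the match()-based lookup from the answer.
t_match <- system.time({
    d <- d1
    d$y2 <- d2$y2[match(d1$x, d2$x)]
})

# merge() sorts its result by the key, so sort d the same way before comparing.
d_sorted <- d[order(d$x), ]
same <- isTRUE(all.equal(m$y1, d_sorted$y1)) &&
    isTRUE(all.equal(m$y2, d_sorted$y2))

print(c(merge = t_merge[["elapsed"]], match = t_match[["elapsed"]]))
print(same)
```

Absolute numbers will differ by machine and R version; the check on `same` only confirms that the two approaches agree when x is unique on both sides.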
