What's the fastest way to merge/join data.frames in R?

走了就别回头了 2020-11-27 08:58

For example (I am not sure this is the most representative case):

N <- 1e6
d1 <- data.frame(x=sample(N,N), y1=rnorm(N))
d2 <- data.frame(x=sample(N,N),          


        
5 Answers
  •  独厮守ぢ
    2020-11-27 10:00

    By using the merge function and its optional parameters:

    Inner join: merge(df1, df2) works for these examples because R automatically joins the frames on their common variable names, but you will most likely want to specify merge(df1, df2, by = "CustomerId") so that you match only on the fields you intend. If the matching variables have different names in the two data frames, use the by.x and by.y parameters instead.

    Outer join: merge(x = df1, y = df2, by = "CustomerId", all = TRUE)
    
    Left outer: merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)
    
    Right outer: merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)
    
    Cross join: merge(x = df1, y = df2, by = NULL)
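
To see how these calls differ, here is a minimal sketch using base R only. The toy data frames, their Product and State columns, and the df3/Id names are made up for illustration and are not from the original post; only merge() and its by, by.x, by.y, all, all.x and all.y arguments come from the answer above.

```r
# Hypothetical toy data: six customers in df1, three in df2.
# CustomerId 7 appears only in df2, so each join type gives a different result.
df1 <- data.frame(CustomerId = 1:6,
                  Product = c("Toaster", "Toaster", "Toaster",
                              "Radio", "Radio", "Radio"))
df2 <- data.frame(CustomerId = c(2L, 4L, 7L),
                  State = c("Alabama", "Alabama", "Ohio"))

# Inner join: keeps only ids present in both frames (2 and 4).
inner <- merge(df1, df2, by = "CustomerId")

# Full outer join: keeps every id from either frame (1 to 7).
outer <- merge(x = df1, y = df2, by = "CustomerId", all = TRUE)

# Left outer join: keeps every row of df1; unmatched rows get NA in State.
left <- merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)

# Right outer join: keeps every row of df2; unmatched rows get NA in Product.
right <- merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)

# Cross join: every row of df1 paired with every row of df2.
cross <- merge(x = df1, y = df2, by = NULL)

# Key columns with different names: df3 (hypothetical) calls the key "Id".
df3 <- data.frame(Id = c(2L, 4L, 7L),
                  State = c("Alabama", "Alabama", "Ohio"))
renamed <- merge(df1, df3, by.x = "CustomerId", by.y = "Id")

# Compare the row counts of each result.
sapply(list(inner = inner, outer = outer, left = left,
            right = right, cross = cross, renamed = renamed), nrow)
```

Comparing the row counts is a quick sanity check that a join kept the rows you expected before running it on large inputs.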
    
