How does one do a full join using data.table?

泪湿孤枕 提交于 2019-12-12 09:29:07

问题


In the data.table FAQ, the nomatch = NA parameter is said to be akin to an outer join. However, I haven't been able to get data.table to do a full outer join – only right outer joins.

For example:

a <- data.table("dog" = c(8:12), "cat" = c(15:19))

   dog cat
1:   8  15
2:   9  16
3:  10  17
4:  11  18
5:  12  19

b <- data.table("dog" = 1:10, "bullfrog" = 11:20)

    dog bullfrog
 1:   1       11
 2:   2       12
 3:   3       13
 4:   4       14
 5:   5       15
 6:   6       16
 7:   7       17
 8:   8       18
 9:   9       19
10:  10       20

setkey(a, dog)
setkey(b, dog)

a[b, nomatch = NA]

    dog cat bullfrog
 1:   1  NA       11
 2:   2  NA       12
 3:   3  NA       13
 4:   4  NA       14
 5:   5  NA       15
 6:   6  NA       16
 7:   7  NA       17
 8:   8  15       18
 9:   9  16       19
10:  10  17       20

So, nomatch = NA produces a right outer join (which is the default). What if I need a full join? For example:

merge(a, b, by = "dog", all = TRUE) 
# Or with plyr:
join(a, b, by = "dog", type = "full")

    dog cat bullfrog
 1:   1  NA       11
 2:   2  NA       12
 3:   3  NA       13
 4:   4  NA       14
 5:   5  NA       15
 6:   6  NA       16
 7:   7  NA       17
 8:   8  15       18
 9:   9  16       19
10:  10  17       20
11:  11  18       NA
12:  12  19       NA

Is that possible with data.table?


回答1:


You actually have it right there. Use merge.data.table which is exactly what you are doing when you call

merge(a, b, by = "dog", all = TRUE)

since a is a data.table, merge(a, b, ...) calls merge.data.table(a, b, ...)




回答2:


x= data.table(a=1:5,b=11:15)
y= data.table(a=c(1:4,6),c=c(101:104,106))

setkey(x,a)
setkey(y,a)

unique_keys <- unique(c(x[,a], y[,a]))
y[x[.(unique_keys), on="a"]  ] # Full Outer Join


来源:https://stackoverflow.com/questions/15170741/how-does-one-do-a-full-join-using-data-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!