R: Counting the number of matches between multiple data frames

Submitted by 百般思念 on 2019-12-24 07:25:21

Question


I want to count the distinct KeyID values that are shared across multiple data.frames.

Data looks like this:

df1: KeyID
       x
       x
       y
       y
       z

df2: KeyID
       x
       x
       x
       z
       z

df3: KeyID
       x
       y
       y
       z

I want to count the number of unique matches between data frames.

The output would look like this: 2

x and z are the only KeyID values that appear in all three data frames (y is missing from df2).

I have done this for two of the data frames, but want to know if there is a faster way:

df1.2 <- df2[df2$KeyID %in% df1$KeyID,]
length(unique(df1.2$KeyID))

Any thoughts?


Answer 1:


You can do set intersection with intersect:

v1 <- c("x", "x", "y", "y", "z")
v2 <- c("x", "x", "x", "z", "z")
intersect(v1, v2)
# [1] "x" "z"
length(intersect(v1, v2))
# [1] 2

Edit: To cover the edited question, as per akrun's suggestion: if there are more than two vectors, put them in a list and fold intersect over the list with Reduce:

v1 <- c("x", "x", "y", "y", "z")
v2 <- c("x", "x", "x", "z", "z")
v3 <- c("x", "y", "y", "z")
vector.list <- list(v1, v2, v3)

Reduce("intersect", vector.list)
# [1] "x" "z"
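
To connect this back to the data frames in the question, here is a minimal sketch. It assumes each data frame has a column named KeyID as shown in the question; the names df.list, key.list and common are made up for illustration. The as.character call is only a precaution in case KeyID was read in as a factor.

```r
# Rebuild the three example data frames from the question
df1 <- data.frame(KeyID = c("x", "x", "y", "y", "z"))
df2 <- data.frame(KeyID = c("x", "x", "x", "z", "z"))
df3 <- data.frame(KeyID = c("x", "y", "y", "z"))

# Hypothetical helper names: collect the data frames, then pull out KeyID
df.list  <- list(df1, df2, df3)
key.list <- lapply(df.list, function(d) as.character(d$KeyID))

# Fold intersect over the list: keeps only IDs present in every data frame
common <- Reduce(intersect, key.list)
common
# [1] "x" "z"

# The count the question asks for
length(common)
# [1] 2
```

Because intersect already drops duplicates, no separate unique call is needed before taking the length.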


Source: https://stackoverflow.com/questions/25045496/r-counting-the-number-of-matches-between-multiple-data-frames
