Euclidean Distances between rows of two data frames in R

Submitted by 做~自己de王妃 on 2021-02-17 03:29:36

Question


Calculating Euclidean distances in R is easy. A good example can be found HERE. The vectorised form, for two columns and rows compared pairwise, is:

sqrt((known_data[, 1] - unknown_data[, 1])^2 + (known_data[, 2] - unknown_data[, 2])^2)

What would be the fastest, most efficient way to get the Euclidean distance from each row of one data frame to every row of another data frame? Is there a particular function from the apply() family that suits this? Thanks!


Answer 1:


You could try combining outer() with dist(), as below:

# rows of the result index known_data, columns index unknown_data
outer(
  seq_len(nrow(known_data)),
  seq_len(nrow(unknown_data)),
  FUN = Vectorize(function(x, y) dist(rbind(known_data[x, ], unknown_data[y, ])))
)
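The outer() call above invokes dist() once per pair of rows, which can become slow as the data frames grow. A possible alternative, sketched here in base R, computes the whole cross-distance matrix with matrix algebra via the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b. The helper name cross_dist is hypothetical, and the iris slices merely stand in for known_data and unknown_data; both data frames are assumed to be all-numeric with the same columns.

```r
# Hypothetical helper (not from the original answers): Euclidean distances
# between every row of `a` and every row of `b`.
cross_dist <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  stopifnot(ncol(a) == ncol(b))
  # squared distance = |a_i|^2 + |b_j|^2 - 2 * (a_i . b_j)
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  # clamp tiny negative values caused by floating-point rounding
  sqrt(pmax(sq, 0))
}

# Stand-in data, assumed for illustration only
known_data   <- iris[1:5, 1:4]
unknown_data <- iris[6:10, 1:4]

d <- cross_dist(known_data, unknown_data)
dim(d)  # one row per known_data row, one column per unknown_data row
```

Because the expansion subtracts large, nearly equal numbers, results can differ from dist() in the last few digits, so this is best treated as a speed-oriented sketch rather than a drop-in replacement.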



Answer 2:


I would use the dist() function (which is very efficient) on the combination of the two data frames and then remove the unneeded distances, if you like. Example:

df1 <- iris[1:5, -5]
df2 <- iris[6:10, -5]

all_distances <- dist(rbind(df1, df2))
all_distances <- as.matrix(all_distances)

# remove unneeded distances
all_distances[1:5, 1:5] <- NA
all_distances[6:10, 6:10] <- NA
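Instead of blanking out the within-group blocks, one option is to pull out only the off-diagonal block that holds the df1-to-df2 distances. The sketch below reuses the same iris example; the names n1, n2 and cross are introduced here for illustration and are not part of the original answer.

```r
df1 <- iris[1:5, -5]
df2 <- iris[6:10, -5]

n1 <- nrow(df1)
n2 <- nrow(df2)

# full (n1 + n2) x (n1 + n2) distance matrix over the stacked rows
all_distances <- as.matrix(dist(rbind(df1, df2)))

# keep only the block with rows from df1 and columns from df2
cross <- all_distances[seq_len(n1), n1 + seq_len(n2), drop = FALSE]
dim(cross)  # n1 rows by n2 columns
```

Note that dist() still computes the within-group distances before they are discarded, so the work grows with the square of the combined row count.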


Source: https://stackoverflow.com/questions/64269505/euclidean-distances-between-rows-of-two-data-frames-in-r
