Recalculating distance matrix

三世轮回 提交于 2019-12-03 23:10:34

I'll assume your extra row means an extra point. If it means an extra variable/dimension, it will call for a different answer.

First of all, for euclidean distance of matrices, I'd recommend the rdist function from the fields package. It is written in Fortran and is a lot faster than the dist function. It returns a matrix instead of a dist object, but you can always go from one to the other using as.matrix and as.dist.

Here is (smaller than yours) sample data

num.points <- 400
num.vars   <- 1000
original.points <- matrix(runif(num.points * num.vars),
                          nrow = num.points, ncol = num.vars)

and the distance matrix you already computed:

d0 <- rdist(original.points)

For the extra point(s), you only need to compute the distances among the extra points and the distances between the extra points and the original points. I will use two extra points to show that the solution is general to any number of extra points:

extra.points <- matrix(runif(2 * num.vars), nrow = 2)
inner.dist   <- rdist(extra.points)
outer.dist   <- rdist(extra.points, original.points)

so you can bind them to your bigger distance matrix:

d1 <- rbind(cbind(d0, t(outer.dist)),
            cbind(outer.dist, inner.dist))

Let's check that it matches what a full, long rerun would have produced:

d2 <- rdist(rbind(original.points, extra.points))

identical(d1, d2)
# [1] TRUE
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!