Sparse Matrix as input to Hierarchical clustering in R

Submitted by 心已入冬 on 2019-12-22 09:49:15

Question


I have a question about clustering with a distance matrix that is sparse.

Is there a sparse distance object format that does not expand the matrix and can work with the sparse representation?

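Currently I'm doing the following:

library(Matrix)  # provides readMM()

# read the sparse matrix (Matrix Market format)
sparse <- readMM('sparse-matrix')
# convert to a dist object, as required by hclust
distance <- as.dist(sparse)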

sparse-matrix is already the correct distance matrix, which has NA's for entries that are not connected.

> sparse
[1,] . . .
[2,] 1 . .
[3,] 1 . .

> as.dist(sparse)
  1 2
2 1
3 1 0

But converting the full matrix with as.dist fails with:

Error in asMethod(object) : negative length vectors are not allowed

Presumably this is because as.dist expands the matrix to its complete (dense) form. The matrix is N x N with N = 49281. The dist object format is required by, for example, the hclust method.
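A quick back-of-the-envelope check makes the size problem concrete. This is only a sketch of one plausible explanation, not something confirmed in the thread: the assumption is that the dense expansion needs a vector whose length no longer fits in a 32-bit integer. The variable names are illustrative.

```r
n <- 49281                       # matrix dimension given in the question

dense_cells <- n^2               # cells in the fully expanded n x n matrix
dist_cells  <- n * (n - 1) / 2   # cells a dist object stores (lower triangle)

# Assumption: the dense conversion indexes with 32-bit integers.
dense_cells > .Machine$integer.max   # is the dense form beyond that limit?

# Approximate memory (in GiB) for a dist object holding doubles.
dist_cells * 8 / 2^30
```

Even if the conversion succeeded, the resulting dist object alone would take roughly 9 GiB, so hclust on the full set of 49281 objects is expensive regardless of how the input is stored.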

A similar question on the R help list has not received any answer.


Answer 1:


How would a distance matrix be sparse? There is a distance between every pair of objects, so it is actually a very dense matrix. However, a triangular matrix is sufficient to describe the mutual distances (as D = D'). This is actually how the objects produced by dist are stored.
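The triangular storage can be seen on a small example. The sketch below uses made-up toy points (pts is a hypothetical name, not from the question) and only base R functions from the stats package.

```r
# Hypothetical toy data: 4 points in the plane.
pts <- matrix(c(0, 0,
                3, 4,
                6, 8,
                0, 1), ncol = 2, byrow = TRUE)
n <- nrow(pts)

d <- dist(pts)          # Euclidean distances, lower triangle only

length(d)               # number of stored values: n * (n - 1) / 2
full <- as.matrix(d)    # expand to the dense n x n form
isSymmetric(full)       # the dense form equals its transpose
```

The dist object keeps only one value per pair, which halves the storage but does not make the matrix sparse: every pair still has an entry.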

If the distance matrix is sparse because lots of objects are the same, then maybe you'd want to calculate the distance matrix only on unique objects.
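A minimal sketch of that idea, assuming the raw observations are available as rows of a numeric matrix (the question only has a precomputed matrix, so x and the other names here are hypothetical): cluster the distinct rows, then map the cluster labels back to every original row.

```r
# Hypothetical toy data: 6 observations, only 3 distinct rows.
x <- rbind(c(0, 0), c(0, 0), c(1, 0), c(1, 0), c(5, 5), c(5, 5))

# Build a key per row so duplicates can be identified and mapped back.
key  <- apply(x, 1, paste, collapse = "_")
keep <- !duplicated(key)
ux   <- x[keep, , drop = FALSE]   # unique objects only
idx  <- match(key, key[keep])     # position of each original row in ux

# Cluster the unique objects; dist now stores 3 values instead of 15.
d  <- dist(ux)
hc <- hclust(d, method = "average")
cl_unique <- cutree(hc, k = 2)

# Expand the cluster labels back to all original observations.
cl_full <- cl_unique[idx]
```

The choice of average linkage and k = 2 is arbitrary here. Note that linkage methods which depend on cluster sizes can give different trees once duplicates are collapsed, so this shortcut is an approximation rather than an exact equivalent.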



Source: https://stackoverflow.com/questions/15911022/sparse-matrix-as-input-to-hierarchical-clustering-in-r
