Calculating Euclidean Distance for Large DataSets

 ̄綄美尐妖づ posted on 2019-12-12 13:44:44

Question


I have to calculate the Euclidean distance between train and test data. The train data has 1389 rows and the test data has 364. The data come from the handwritten ZIP codes on envelopes from U.S. postal mail, downloaded from the website of "Elements of Statistical Learning".

I am a beginner and have only just read the data into R. I don't know how to start calculating the distances between the train and test data. Can anyone give me an idea of how to write a loop for this data?

I would be thankful.


Answer 1:


For Euclidean distances, I like using rdist from the fields package. One advantage over dist from the stats package is that it can take two matrices as input:

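# Simulated stand-ins with the same row counts as the train and test sets
train.data <- matrix(runif(1389*2), ncol = 2)
test.data  <- matrix(runif(364*2),  ncol = 2)

library(fields)
# Entry [i, j] is the distance between train row i and test row j
distances <- rdist(train.data, test.data)
dim(distances)
# [1] 1389  364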
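
Since the question asks about a loop, here is a minimal base-R sketch of the same computation without any extra package. It assumes both matrices hold only numeric feature columns with one observation per row. If your data carry a label column, you would need to drop it first. The names loop.dist and vec.dist are made up for illustration, and the data are again simulated stand-ins rather than the real ZIP code data.

```r
# Simulated stand-ins (assumption): rows are observations, columns are features.
set.seed(1)
train.data <- matrix(runif(1389 * 2), ncol = 2)
test.data  <- matrix(runif(364 * 2),  ncol = 2)

# Explicit loop: fill one column of distances per test row.
loop.dist <- matrix(NA_real_, nrow = nrow(train.data), ncol = nrow(test.data))
for (j in seq_len(nrow(test.data))) {
  # Subtract test row j from every train row (sweep's default FUN is "-"),
  # then take the Euclidean norm of each resulting row.
  diffs <- sweep(train.data, 2, test.data[j, ])
  loop.dist[, j] <- sqrt(rowSums(diffs^2))
}

# Vectorized alternative using ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
sq <- outer(rowSums(train.data^2), rowSums(test.data^2), "+") -
  2 * tcrossprod(train.data, test.data)
vec.dist <- sqrt(pmax(sq, 0))  # clamp tiny negatives caused by rounding

dim(loop.dist)
# [1] 1389  364
all.equal(loop.dist, vec.dist)
```

The loop is easier to follow, and with only 364 test rows it is fast enough. The vectorized form avoids the loop but can lose a little precision for very close points, which is why the squared distances are clamped at zero before taking the square root.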


Source: https://stackoverflow.com/questions/10220640/calculating-euclidean-distance-for-large-datasets
