Find nearest points of latitude and longitude from different data sets with different lengths

你的背包 2020-12-19 08:24

I have two data sets from different stations. Both are basically data.frames of coordinates (longitudes and latitudes). Given the first data set (or vice versa), I want to

5 Answers
  •  一生所求
    2020-12-19 09:20

    If you have extremely large datasets, a distance function can be cumbersome, because for each point in the reference data it must calculate the distance to every point in the other data set. The 'ann' function from the 'yaImpute' package is a very fast approximate nearest-neighbour routine that works well for large distance calculations. It returns however many "closest" records you ask for (the value of k), along with the distance to each of them.

    Note: despite being an approximate nearest-neighbour search, the results are stable on repeated runs of the same data. It does not involve any random selection of points. See the documentation.

    FWIW, I'm really not kidding about fast. I've used this to find knn distances for two matrices, each with millions of points. Making a distance matrix for this or doing it iteratively row-by-row is either unfeasible or painfully slow.
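
To make the cost argument concrete, here is a minimal base R sketch of the exact row-by-row search that an approximate routine avoids. The matrices `ref` and `target` and the helper `nearest_one` are hypothetical names with made-up coordinates, not part of the original answer. Each target point is compared against every reference point, so the work grows with nrow(target) * nrow(ref).

```r
# Hypothetical coordinates (assumed for illustration only)
ref    <- cbind(x = c(0, 3, 10), y = c(0, 4, 10))  # points to search in
target <- cbind(x = c(1, 9),     y = c(1, 9))      # points to look up

# For one target point p, compute the squared Euclidean distance to
# every reference point and keep the closest one.
nearest_one <- function(p, ref) {
  p  <- unname(p)
  d2 <- (ref[, 1] - p[1])^2 + (ref[, 2] - p[2])^2
  i  <- which.min(d2)
  c(index = i, dist = sqrt(d2[i]))
}

# One pass over the reference data per target row
brute <- t(apply(target, 1, nearest_one, ref = ref))
brute
```

This is fine for small inputs, but with millions of rows on each side the number of distance evaluations becomes impractical, which is the situation described above.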

    Quick example:

    # Hypothetical coordinate data
    set.seed(2187); foo1 <- round(abs(data.frame(x=runif(5), y=runif(5))*100))
    set.seed(2187); foo2 <- round(abs(data.frame(x=runif(10), y=runif(10))*100))
    foo1; foo2
    
    # the 'ann' command from the 'yaImpute' package
    # install.packages("yaImpute")  # run once if the package is not installed
    library(yaImpute)
    
    # Approximate nearest-neighbour search, reporting the 3 nearest points (k=3)
    # This command finds the 3 nearest points in foo2 for each point in foo1
    # In the output:
    #   The first k columns are the row numbers (in foo2) of the nearest points
    #   The next k columns (k+1 to 2k) are the *squared* euclidean distances
    knn.out <- ann(as.matrix(foo2), as.matrix(foo1), k=3)
    knn.out$knnIndexDist
    
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    1    5    4  729 1658 2213
    [2,]    2    3    7   16  100 1025
    [3,]    9    7    5   40   81  740
    [4,]    4    1    6   16  580  673
    [5,]    5    7    9    0  677  980
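
As a follow-up, the index and distance columns can be split apart and the squared distances converted to plain Euclidean distances. The sketch below rebuilds the matrix from the values printed above so that it runs without the package; in practice you would use `knn.out$knnIndexDist` directly. The names `idx_dist`, `nn_index`, `nn_dist` and `nearest` are hypothetical.

```r
k <- 3

# Values copied from the knn.out$knnIndexDist output printed above
idx_dist <- matrix(c(1, 5, 4, 729, 1658, 2213,
                     2, 3, 7,  16,  100, 1025,
                     9, 7, 5,  40,   81,  740,
                     4, 1, 6,  16,  580,  673,
                     5, 7, 9,   0,  677,  980),
                   ncol = 2 * k, byrow = TRUE)

nn_index <- idx_dist[, 1:k]                     # row numbers in foo2
nn_dist  <- sqrt(idx_dist[, (k + 1):(2 * k)])   # Euclidean distances

# Closest match in foo2 for each row of foo1
nearest <- data.frame(foo1_row = seq_len(nrow(idx_dist)),
                      foo2_row = nn_index[, 1],
                      dist     = nn_dist[, 1])
nearest
```

Note the parentheses in `(k + 1):(2 * k)`: in R the `:` operator binds more tightly than `+`, so `k + 1:(2 * k)` would select the wrong columns.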
    

    https://cran.r-project.org/web/packages/yaImpute/index.html
