Nearest neighbor search in 2D using a grid partitioning

99封情书, posted 2019-12-05 13:20:49

Since the dimensions of your bitmap are not large and you want to calculate the closest point for every (x,y), you can use dynamic programming.

Let V[i][j] be the distance from (i,j) to the closest point in the set, but considering only the points in the set that are in the "rectangle" [(1, 1), (i, j)].

Then V[i][j] = 0 if there is a point at (i, j); otherwise V[i][j] = min(V[i'][j'] + dist((i, j), (i', j'))), where (i', j') ranges over the three neighbours of (i, j):

  • (i - 1, j)
  • (i, j - 1)
  • (i - 1, j - 1)

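This gives the minimum distance, but only for the "upper left" rectangle. Do the same for the "upper right", "lower left", and "lower right" orientations, and then take the minimum of the four values. Because distances are accumulated one neighbour step at a time, the result is exact when the closest point lies in the same row, column, or diagonal, and a slight overestimate of the Euclidean distance otherwise.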

The complexity is O(size of the plane), which is optimal.
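The four sweeps described above can be sketched as follows. This is a minimal illustration rather than a tuned implementation: the function name `distance_map`, the 0-based `(x, y)` indexing, and the `v[y][x]` layout are assumptions made for the example.

```python
import math

def distance_map(width, height, points):
    """Four-pass DP over a width x height grid; points is an iterable of (x, y)."""
    inf = math.inf
    occupied = set(points)
    diag = math.sqrt(2.0)
    best = [[inf] * width for _ in range(height)]
    # One sweep per orientation; (dx, dy) is the direction the sweep moves in.
    for dx, dy in ((1, 1), (-1, 1), (1, -1), (-1, -1)):
        v = [[inf] * width for _ in range(height)]
        xs = range(width) if dx == 1 else range(width - 1, -1, -1)
        ys = range(height) if dy == 1 else range(height - 1, -1, -1)
        for y in ys:
            for x in xs:
                if (x, y) in occupied:
                    v[y][x] = 0.0
                    continue
                px, py = x - dx, y - dy  # neighbours already visited in this sweep
                cand = inf
                if 0 <= px < width:
                    cand = min(cand, v[y][px] + 1.0)
                if 0 <= py < height:
                    cand = min(cand, v[py][x] + 1.0)
                if 0 <= px < width and 0 <= py < height:
                    cand = min(cand, v[py][px] + diag)
                v[y][x] = cand
        # Combine the four orientations by taking the minimum per cell.
        for y in range(height):
            for x in range(width):
                best[y][x] = min(best[y][x], v[y][x])
    return best
```

Each sweep touches every cell once, so the total work is four passes over the grid. To recover the closest point itself rather than just the distance, the same sweeps could carry the coordinates of the point each value came from.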

For your task, a point quadtree is usually used, especially when the points are not evenly distributed.

To save main memory, you can also use a PM or PMR quadtree, which uses buckets.

You search in your cell and, in the worst case, in all quad cells surrounding that cell.

You can also use a k-d tree.
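As a rough idea of what the k-d tree option looks like, here is a small pure-Python sketch for 2D points. The names `build_kdtree` and `nearest` and the nested-tuple node layout are choices made for this example, not part of any particular library. It requires Python 3.8+ for `math.dist`.

```python
import math

def build_kdtree(points, depth=0):
    """Return a nested (point, left, right) tuple, or None for an empty list."""
    if not points:
        return None
    axis = depth % 2  # alternate between x and y
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid],
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))

def nearest(node, query, depth=0, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    point, left, right = node
    d = math.dist(point, query)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % 2
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Only cross the splitting line if a closer point could lie beyond it.
    if abs(diff) < best[0]:
        best = nearest(far, query, depth + 1, best)
    return best
```

For example, `nearest(build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]), (9, 2))` returns the point `(8, 1)` together with its distance.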

One solution would be to construct multiple partitionings with different grid sizes.

Assume you create partitions with cell sizes 1, 2, 4, 8, ...

Now search for a point at grid size 1 (you are basically searching 9 squares). If there is a point in the search area and the distance to that point is less than 1, stop. Otherwise, move on to the next grid size.

The total number of cells you need to construct is at most about twice that of a single level of partitioning, because each coarser level has far fewer cells than the one below it.
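A minimal sketch of this multi-level search, assuming a non-empty list of 2D points, might look like the following. The function names, the `max_size` cap, and the brute-force fallback are assumptions added for the example.

```python
import math
from collections import defaultdict

def build_levels(points, max_size):
    """Map each cell size 1, 2, 4, ... <= max_size to a dict of cell -> points."""
    levels = {}
    size = 1
    while size <= max_size:
        grid = defaultdict(list)
        for p in points:
            grid[(math.floor(p[0] / size), math.floor(p[1] / size))].append(p)
        levels[size] = grid
        size *= 2
    return levels

def nearest_multilevel(levels, points, query):
    """Search the 3x3 block at each level, finest first; return (distance, point)."""
    for size in sorted(levels):
        grid = levels[size]
        cx, cy = math.floor(query[0] / size), math.floor(query[1] / size)
        best = None
        for gx in range(cx - 1, cx + 2):
            for gy in range(cy - 1, cy + 2):
                for p in grid.get((gx, gy), ()):
                    d = math.dist(p, query)
                    if best is None or d < best[0]:
                        best = (d, p)
        # Any point outside the 3x3 block is more than `size` away from the
        # query, so a hit closer than `size` cannot be beaten.
        if best is not None and best[0] < size:
            return best
    # Fallback (an assumption of this sketch): brute force if no level succeeded.
    return min((math.dist(p, query), p) for p in points)
```

The stopping test is what makes the answer exact: the query lies in the centre cell of the 3x3 block, so everything outside the block is farther away than one cell width.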

A solution I'm trying:

  • First, build a grid such that each box holds about 1 point on average (more if you want a larger scan).
  • Select the center box, i.e. the one containing the query. Keep selecting neighbor boxes in rings around it until you find at least one point. At this stage you have 1, 9, 25, and so on boxes selected.
  • Select one more layer of adjacent boxes.
  • Now you have a fairly small list of points, usually no more than 10, which you can plug into the distance formula to find the nearest neighbor.

Since you have on average 1 point per box, you will mostly be selecting 9 boxes and comparing about 9 distances. You can adjust the grid size according to your dataset's properties to achieve better results.
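The ring-by-ring scan can be sketched as below. Note one deliberate deviation from the steps above: rather than always adding exactly one extra layer, this version keeps expanding until the best distance found is no larger than the radius that has been fully scanned, since a point in a box's far corner can be farther away than a point two rings out. The names `build_grid`, `nearest_in_grid`, and the `max_rings` cap are assumptions for the example.

```python
import math
from collections import defaultdict

def build_grid(points, cell):
    """Bucket 2D points into square boxes of side `cell`."""
    grid = defaultdict(list)
    for p in points:
        grid[(math.floor(p[0] / cell), math.floor(p[1] / cell))].append(p)
    return grid

def nearest_in_grid(grid, cell, query, max_rings):
    """Scan square rings of boxes around the query; return (distance, point) or None."""
    cx, cy = math.floor(query[0] / cell), math.floor(query[1] / cell)
    best = None
    for r in range(max_rings + 1):
        for gx in range(cx - r, cx + r + 1):
            for gy in range(cy - r, cy + r + 1):
                if max(abs(gx - cx), abs(gy - cy)) != r:
                    continue  # interior boxes were handled by earlier rings
                for p in grid.get((gx, gy), ()):
                    d = math.dist(p, query)
                    if best is None or d < best[0]:
                        best = (d, p)
        # All boxes within r rings are scanned, so every point closer than
        # r * cell is already known; a hit within that radius is final.
        if best is not None and best[0] <= r * cell:
            return best
    return best  # unconfirmed (or None) if max_rings was too small
```

With about one point per box the loop usually ends after one or two rings, which matches the "roughly 9 boxes" estimate. The inner loops walk the whole square and skip interior boxes for simplicity; iterating only the ring's border would avoid that extra work.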

Also, if your data has a lot of variance, you can try 2 (or more) levels of grid: if a selection returns more than 50 points in a single query, start another grid search with a grid 1/10th the size ...
