Comparison of search time between k-d tree and brute force

Submitted by 南笙酒味 on 2019-12-25 02:22:52

Question


This is a graph of execution time versus dimension for the k-d tree and the brute-force search that I wrote. The size of the point set was fixed at 1 M (1,000,000) points, and the time was measured over 1000 queries. The k-d tree's time increases enormously with the dimension, but the brute-force time does not. I wonder why I am getting these results and how they can be improved.


Answer 1:


Some ideas:

  • The performance may depend a lot on the characteristics of the data. For example, are the data points evenly distributed, clustered or otherwise arranged?

  • Also, what kind of query are you performing? One explanation would be that you are using a window query that returns the whole point set, or large parts of it. In that case, brute force will always be faster, because the tree has to visit almost every node anyway and adds traversal overhead on top.

  • Is there perhaps a flaw in the k-d tree implementation?
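The point about window queries can be checked with a small experiment. The following is a minimal sketch, not the asker's code: the helper names `build`, `window_query` and `run`, the 2,000 random 2-D points and the window bounds are all assumptions chosen for illustration. It counts how many tree nodes a window query touches, so a small window can be compared with one that covers the whole data set.

```python
import random

def build(points, depth=0):
    """Build a k-d tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point becomes the node
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def window_query(node, lo, hi, out, stats):
    """Collect points p with lo[i] <= p[i] <= hi[i]; count visited nodes."""
    if node is None:
        return
    point, axis, left, right = node
    stats["visited"] += 1
    if all(l <= x <= h for l, x, h in zip(lo, point, hi)):
        out.append(point)
    # The left subtree only holds coordinates <= the split value.
    if lo[axis] <= point[axis]:
        window_query(left, lo, hi, out, stats)
    # The right subtree only holds coordinates >= the split value.
    if point[axis] <= hi[axis]:
        window_query(right, lo, hi, out, stats)

# Hypothetical test data: 2,000 uniformly random points in the unit square.
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(2000)]
tree = build(pts)

def run(lo, hi):
    out, stats = [], {"visited": 0}
    window_query(tree, lo, hi, out, stats)
    return out, stats["visited"]

small, visited_small = run((0.40, 0.40), (0.45, 0.45))
full, visited_full = run((0.0, 0.0), (1.0, 1.0))
print("small window:", len(small), "results,", visited_small, "nodes visited")
print("full window: ", len(full), "results,", visited_full, "nodes visited")
```

When the window covers everything, the tree cannot prune any subtree, so it visits every node, which is the same amount of work as a brute-force scan plus the cost of the recursion.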

In general, k-d trees are known to scale poorly with increasing dimensionality. In machine learning, for example, the dimensionality is therefore often reduced to around 10 to 20. However, unless you run the brute-force search on a GPU, the k-d tree should still be faster.
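The scaling problem can be made visible by counting distance evaluations instead of measuring wall-clock time. The sketch below is a simplified pure-Python k-d tree written for this illustration, not the asker's implementation; `build`, `sq_dist`, `nearest`, `average_visits`, the 2,000 random points and the 20 queries per dimension are all assumptions. Brute force always evaluates one distance per point, so the tree only wins while it evaluates far fewer.

```python
import random

def build(points, depth=0):
    """Build a k-d tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best, stats):
    """Nearest-neighbor search; best is [best_sq_dist, best_point]."""
    if node is None:
        return
    point, axis, left, right = node
    stats["dist"] += 1                      # one distance evaluation per node
    d = sq_dist(point, query)
    if d < best[0]:
        best[0], best[1] = d, point
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    nearest(near, query, best, stats)
    # Descend into the far side only if the splitting plane is closer
    # than the best match found so far.
    if diff * diff < best[0]:
        nearest(far, query, best, stats)

def average_visits(dim, n=2000, queries=20, seed=0):
    """Average number of distance evaluations per query for random data."""
    rng = random.Random(seed)
    pts = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    tree = build(pts)
    total = 0
    for _ in range(queries):
        q = tuple(rng.random() for _ in range(dim))
        best, stats = [float("inf"), None], {"dist": 0}
        nearest(tree, q, best, stats)
        # Sanity check against brute force, which evaluates all n distances.
        assert best[0] == min(sq_dist(p, q) for p in pts)
        total += stats["dist"]
    return total / queries

for dim in (2, 4, 8, 16):
    print(dim, "dimensions:", average_visits(dim), "of 2000 distance evaluations")
```

On uniformly random data the count should climb toward the size of the point set as the dimension grows, because the pruning test rarely succeeds. At that point the tree does roughly the same distance work as brute force and pays for the traversal on top. The built-in comparison against brute force also doubles as a quick correctness check for the tree.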

If you are looking for structures that scale better with high dimensions (insertion / window query), have a look at R*Trees or the PH-Tree (disclosure: the latter is my own project; it is currently limited to 60 dimensions, but a high-dimensional version will be released this week). For k-nearest-neighbor search, have a look at CoverTrees or BallTrees. If you are using Java, you can have a look at the implementations in my repo. I also implemented an R*Tree here.



Source: https://stackoverflow.com/questions/50551877/comparison-search-time-between-k-d-tree-and-brute-force
