How can I choose eps and minPts (two parameters for DBSCAN algorithm) for efficient results?

Posted by 谁都会走 on 2019-12-11 17:07:43

Question


What routine or algorithm should I use to choose the eps and minPts parameters for the DBSCAN algorithm so that it gives good results?


Answer 1:


The DBSCAN paper suggests choosing minPts based on the dimensionality of the data, and eps based on the elbow in the sorted k-distance graph.
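To make the k-distance idea concrete, here is a minimal sketch in plain Python. It computes each point's distance to its k-th nearest neighbor, sorts those distances in descending order, and picks a candidate eps at the "elbow". The toy data, the choice `min_pts = 4`, the use of `k = min_pts - 1` (so the point itself is not counted), and the `elbow_index` heuristic are all my own assumptions for illustration. The paper's suggestion is to look at the plot, so treat the automatic elbow as a rough starting point only.

```python
import math
import random


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbor, sorted descending."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result, reverse=True)


def elbow_index(values):
    """Hypothetical heuristic: index of the value farthest from the chord
    joining the first and last points of the curve (axes are not rescaled)."""
    n = len(values)
    x1, y1, x2, y2 = 0, values[0], n - 1, values[-1]

    def gap(i):
        # Unnormalized perpendicular distance from (i, values[i]) to the chord.
        return abs((y2 - y1) * i - (x2 - x1) * values[i] + x2 * y1 - y2 * x1)

    return max(range(n), key=gap)


random.seed(0)
# Toy data, made up for illustration: two tight blobs plus scattered noise.
blob_a = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)]
blob_b = [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(50)]
noise = [(random.uniform(-3, 8), random.uniform(-3, 8)) for _ in range(10)]
points = blob_a + blob_b + noise

min_pts = 4  # assumed starting value for 2-D data
kd = k_distances(points, k=min_pts - 1)
eps_candidate = kd[elbow_index(kd)]
print(f"candidate eps = {eps_candidate:.3f}")
```

In practice you would plot `kd` and check by eye that the chosen value really sits where the curve bends.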

In the more recent publication

Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017).
DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN.
ACM Transactions on Database Systems (TODS), 42(3), 19.

the authors suggest using a larger minPts for large and noisy data sets, and adjusting epsilon depending on whether you get clusters that are too large (decrease epsilon) or too much noise (increase epsilon). Clustering is an iterative process.
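The adjust-and-rerun loop described above might be sketched as follows. This is not code from the paper: the `dbscan` function is a minimal O(n^2) implementation written for this example, and `tune_eps` with its thresholds (`max_noise`, `max_cluster`) and its `shrink` factor is entirely hypothetical. It only mechanizes the rule "one cluster too large: decrease eps; too much noise: increase eps". It does not replace looking at the resulting clusters yourself.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Minimal O(n^2) DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # only core points expand the cluster
        cluster += 1
    return labels


def tune_eps(points, eps, min_pts, max_noise=0.2, max_cluster=0.6,
             shrink=0.8, rounds=20):
    """Hypothetical loop: shrink eps if one cluster dominates, grow it if
    noise dominates. All thresholds are illustrative assumptions."""
    for _ in range(rounds):
        labels = dbscan(points, eps, min_pts)
        n = len(labels)
        sizes = {}
        for lab in labels:
            if lab != NOISE:
                sizes[lab] = sizes.get(lab, 0) + 1
        largest_frac = max(sizes.values(), default=0) / n
        noise_frac = labels.count(NOISE) / n
        if largest_frac > max_cluster:
            eps *= shrink   # clusters too large: decrease eps
        elif noise_frac > max_noise:
            eps /= shrink   # too much noise: increase eps
        else:
            break
    return eps, dbscan(points, eps, min_pts)


# Toy data, made up for illustration: two small grids plus three stray points.
grid_a = [(x * 0.2, y * 0.2) for x in range(5) for y in range(5)]
grid_b = [(5 + x * 0.2, 5 + y * 0.2) for x in range(5) for y in range(5)]
points = grid_a + grid_b + [(2.5, 2.5), (10.0, 0.0), (0.0, 10.0)]

eps, labels = tune_eps(points, eps=8.0, min_pts=4)
n_clusters = len(set(labels) - {NOISE})
print(f"eps = {eps:.4f}, clusters = {n_clusters}, noise = {labels.count(NOISE)}")
```

Starting from a deliberately oversized eps, the loop keeps shrinking it until no single cluster covers more than the assumed 60% of the points. A fixed multiplicative step can oscillate on harder data, which is one more reason to inspect the results rather than trust the loop.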

That paper was an interesting read, because it shows what can go wrong if you don't look at your data. People are too obsessed with performance metrics, and forget to look at the actual data.



Source: https://stackoverflow.com/questions/47533930/how-can-i-choose-eps-and-minpts-two-parameters-for-dbscan-algorithm-for-effici
