How to find the optimal point for DBSCAN() parameters in R

Submitted by a 夏天 on 2019-12-31 07:04:27

Question


How can I find optimal values for the dbscan() parameters (eps, MinPts)?

dbscan() from package fpc implements the DBSCAN (density-based clustering) method.


Answer 1:


You can find strategies for choosing minPts and epsilon discussed in the original DBSCAN paper:

Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996, August). A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (Vol. 96, No. 34, pp. 226-231).

Also read up on some newer developments:

Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN. ACM Transactions on Database Systems (TODS), 42(3), 19.

This newer article also discusses how to set the parameters, and how not to. It offers some interesting insight into what can go wrong.

I didn't find an open access version of this article, but you can use Sci-Hub (Wikipedia).

And, of course, if choosing epsilon is difficult, you may want to use OPTICS or HDBSCAN* instead.
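As a rough sketch of those alternatives, the dbscan package (not fpc) provides hdbscan() and optics(). The iris data, minPts = 5, the eps upper bound of 10 and the eps_cl cut-offs below are illustrative assumptions, not recommendations:

```r
library(dbscan)

# Illustrative data only: the four numeric columns of iris
x <- as.matrix(iris[, 1:4])

# HDBSCAN*: only minPts has to be chosen, there is no eps
hdb <- hdbscan(x, minPts = 5)
print(table(hdb$cluster))        # cluster 0 is noise

# OPTICS: eps is only an upper limit on the neighbourhood size;
# 10 is larger than any pairwise distance in this data set
opt <- optics(x, eps = 10, minPts = 5)
plot(opt)                        # reachability plot

# One OPTICS run can be cut at several eps values afterwards
cl_a <- extractDBSCAN(opt, eps_cl = 0.4)
cl_b <- extractDBSCAN(opt, eps_cl = 0.6)
print(table(cl_a$cluster))
print(table(cl_b$cluster))
```

The valleys in the reachability plot correspond to dense regions, so it can also serve as a visual aid when deciding where to cut.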




Answer 2:


This is discussed in ?dbscan in package dbscan:

"Setting parameters for DBSCAN: minPts is often set to be dimensionality of the data plus one or higher. The knee in kNNdistplot can be used to find suitable values for eps."
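A minimal sketch of that heuristic with the dbscan package might look like the following. The iris data and the value eps = 0.5 are assumptions for illustration; in practice eps is read off the knee of your own plot:

```r
library(dbscan)

# Illustrative data only: the four numeric columns of iris
x <- as.matrix(iris[, 1:4])

# Heuristic from ?dbscan: minPts = dimensionality of the data + 1 (or higher)
min_pts <- ncol(x) + 1

# Sorted distances to the k-th nearest neighbour; the point itself counts
# towards minPts, hence k = minPts - 1. Look for the knee in this curve.
kNNdistplot(x, k = min_pts - 1)
abline(h = 0.5, lty = 2)         # candidate eps chosen by eye (assumption)

# Run DBSCAN with the values picked above
res <- dbscan(x, eps = 0.5, minPts = min_pts)
print(res)
print(table(res$cluster))        # cluster 0 is noise
```

If the curve shows several knees, the data probably contains clusters of different densities, and a single eps may not fit all of them.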



Source: https://stackoverflow.com/questions/47110357/how-to-find-the-optimal-point-for-dbscan-parameters-in-r
