eps estimation for DBSCAN by not using the already suggested algorithm in the Original research paper

烈酒焚心 提交于 2019-12-23 06:26:05

问题


I have to implement DBSCAN using python, and the epsilon estimation has been posing problems as the already suggested method in the original research paper assumes blob like distribution of the dataset, where as in my case it is more of a cure fittable data with jumps at some intervals. The jumps cause the DBSCAN to form different clusters of various datasets in the intervals between jumps(which is good enough for me), but the epsilon calculation dynamically for different datasets does not produce desired results as the points tend to lie on a straight line for many intervals, and changing 'k' value cause a considerable change in the eps value.


回答1:


Try using OPTICS algorithm, you won't need to estimate eps in that. Also, I would suggest recursive regression, where you use the python's best curve fit scipy.optimize.curve_fit to get best curve, and then find the rms error of all the points wrt the curve. Then remove 'n' percent of points, and recursively repeat this untill your rms error is less than your threshold.



来源:https://stackoverflow.com/questions/30325591/eps-estimation-for-dbscan-by-not-using-the-already-suggested-algorithm-in-the-or

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!