How to apply the DBSCAN algorithm to group similar URLs [closed]

Submitted by 时光毁灭记忆、已成空白 on 2019-12-20 07:56:13

Question


How can I group similar URLs using the DBSCAN algorithm? I have seen many datasets, but none of them consisted of URLs. I want to take URLs of a similar type and group them together. I do not know how to choose the distance threshold (eps); minPts could be the number of URLs to be grouped.


Answer 1:


DBSCAN needs a distance function and a threshold for detecting similar objects.

So first you need to define an appropriate distance function and a threshold; then we can help you with DBSCAN (though you should be able to find DBSCAN implementations that can be extended to arbitrary distance functions).

The key challenge is the distance function, and that choice is up to you: it is subjective, and we do not know what result you want or need.
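To make the point about the distance function concrete, here is a minimal sketch of one possible choice. Everything specific in it is an assumption for illustration and not part of the answer above: the tokenization in `url_tokens`, the use of Jaccard distance, the sample URLs, and the values `eps=0.5` and `min_pts=3`. The `dbscan` function is a plain textbook-style implementation written out so that the distance function can be swapped freely; a different notion of "similar" would need a different distance and threshold.

```python
import re
from urllib.parse import urlparse


def url_tokens(url):
    """Split a URL's host and path into a set of tokens (illustrative choice)."""
    parts = urlparse(url)
    text = parts.netloc + parts.path
    return {t for t in re.split(r"[/.\-_?=&]+", text) if t}


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over the token sets of two URLs."""
    ta, tb = url_tokens(a), url_tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def dbscan(items, distance, eps, min_pts):
    """Basic DBSCAN over arbitrary items; returns one label per item (-1 = noise)."""
    NOISE = -1
    labels = [None] * len(items)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force region query; a point counts as its own neighbor.
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical sample data, chosen only to show two groups and one outlier.
urls = [
    "http://example.com/blog/2012/post-one",
    "http://example.com/blog/2012/post-two",
    "http://example.com/blog/2012/post-three",
    "http://shop.example.org/cart/view",
    "http://shop.example.org/cart/checkout",
    "http://shop.example.org/cart/empty",
    "http://unrelated.net/about",
]

labels = dbscan(urls, jaccard_distance, eps=0.5, min_pts=3)
for label, url in zip(labels, urls):
    print(label, url)
```

With these sample URLs, the three blog URLs share five of seven tokens pairwise (distance about 0.29) and the three shop URLs share four of six (distance about 0.33), so each set forms its own cluster, while the last URL has no neighbors within `eps` and is labeled as noise. Changing the tokenization or the threshold changes the grouping, which is exactly why the distance has to be decided first.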



Source: https://stackoverflow.com/questions/12422576/how-to-apply-dbscan-algorithm-on-grouping-of-similar-url
