dbscan

Running DBSCAN in ELKI

笑着哭i submitted on 2019-11-27 07:21:34
Question: I am trying to cluster some geospatial data, and I previously tried the WEKA library. I found this benchmark and decided to try ELKI. Despite the advice not to use ELKI as a Java library (which is supposed to be less maintained than the UI), I incorporated it into my application, and I am quite happy with the results. The structures it uses to store data are far more efficient than the ones used by Weka, and the fact that it has the option of using a spatial index is

scikit-learn DBSCAN memory usage

£可爱£侵袭症+ submitted on 2019-11-26 18:39:37
UPDATED: In the end, the solution I opted to use for clustering my large dataset was the one suggested by Anony-Mousse below. That is, using ELKI's DBSCAN implementation to do my clustering rather than scikit-learn's. It can be run from the command line and, with proper indexing, performs this task within a few hours. Use the GUI and small sample datasets to work out the options you want to use and then go to town. Worth looking into. Anywho, read on for a description of my original problem and some interesting discussion. I have a dataset with ~2.5 million samples, each with 35 features (floating
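For a sense of scale, here is a rough back-of-envelope sketch of the sizes involved. The sample and feature counts come from the excerpt above; the 8-byte float width and the helper names (`data_bytes`, `dense_pairwise_bytes`, `human`) are assumptions made for illustration. The excerpt does not say whether any particular DBSCAN implementation builds a full pairwise distance matrix, so the second figure only shows what such a matrix would cost if one were materialized.

```python
# Back-of-envelope memory estimate; an illustration, not a measurement.
N_SAMPLES = 2_500_000   # "~2.5 million samples" from the question
N_FEATURES = 35         # "35 features" from the question
BYTES_PER_FLOAT = 8     # assumption: 64-bit floating-point values


def data_bytes(n, d, width=BYTES_PER_FLOAT):
    """Size in bytes of the raw n x d feature array."""
    return n * d * width


def dense_pairwise_bytes(n, width=BYTES_PER_FLOAT):
    """Size in bytes of a full n x n distance matrix, if one were built."""
    return n * n * width


def human(nbytes):
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if nbytes < 1000 or unit == "PB":
            return f"{nbytes:.1f} {unit}"
        nbytes /= 1000


if __name__ == "__main__":
    print("raw data:       ", human(data_bytes(N_SAMPLES, N_FEATURES)))
    print("dense distances:", human(dense_pairwise_bytes(N_SAMPLES)))
```

Under these assumptions the raw data fits comfortably in memory, while a dense distance matrix would not, which is consistent with the poster's move to an index-backed implementation.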

scikit-learn DBSCAN memory usage

瘦欲@ submitted on 2019-11-26 06:29:37
Question: UPDATED: In the end, the solution I opted to use for clustering my large dataset was the one suggested by Anony-Mousse below. That is, using ELKI's DBSCAN implementation to do my clustering rather than scikit-learn's. It can be run from the command line and, with proper indexing, performs this task within a few hours. Use the GUI and small sample datasets to work out the options you want to use and then go to town. Worth looking into. Anywho, read on for a description of my original problem and