DBSCAN on Spark: which implementation?

名媛妹妹 2020-12-28 21:09

I would like to run DBSCAN on Spark. So far I have found two implementations:

https://github.com/irvingc/dbscan-on-spark
https://github.com/alito

4 answers

误落风尘 (original poster)

2020-12-28 22:01

I tested https://github.com/irvingc/dbscan-on-spark and can say that it consumes a lot of memory. For a 400K-point dataset with a smooth distribution I used -Xmx12084m, and even then it ran too long (more than 20 minutes). In addition, it only supports 2D data. I used the project with Maven, not sbt.

I also tested the second implementation. It is still the best one I have found. Unfortunately, the author has not maintained it since 2015. It took real effort to raise the Spark version and resolve the resulting version conflicts, which I needed to do in order to deploy on AWS.
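
When comparing implementations like these, it can help to have a small single-machine reference to check cluster labels on a sample before scaling out. The following is only a minimal sketch of textbook DBSCAN in plain Python, not code from either repository; the function name `dbscan` and the parameters `eps` and `min_pts` are illustrative assumptions, and the naive O(n^2) neighbor search is only suitable for small samples.

```python
from math import hypot

NOISE = -1  # label used for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Naive O(n^2) DBSCAN over 2D points; returns one label per point."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(sample, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

Running the same sample and parameters through a distributed implementation should give the same grouping, although cluster ids may be numbered differently.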
