Mallet topic modelling

Submitted by 旧城冷巷雨未停 on 2019-12-05 08:17:50

In bin/mallet.bat, increase the value on this line:

set MALLET_MEMORY=1G

I'm not sure how well Mallet scales to big data, but the project at http://dragon.ischool.drexel.edu/ can keep its data in disk-backed persistent storage, so it can scale to arbitrarily large corpora (at lower performance, of course).

The model is still going to be huge, even if it is read from multiple files. Have you tried increasing the heap size of your Java VM?

A java.lang.OutOfMemoryError occurs mainly because of insufficient heap space. You can use -Xms and -Xmx to set the initial and maximum heap size, respectively, so that the error is less likely to recur.
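
To confirm that a heap setting actually took effect, a small check along these lines may help. It is only a minimal sketch: the class name HeapCheck and its helper methods are made up for illustration and are not part of Mallet. It relies only on the standard Runtime API.

```java
public class HeapCheck {
    // Convert a byte count to whole mebibytes (hypothetical helper).
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum heap the JVM will attempt to use; this reflects -Xmx.
    public static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): upper bound on the heap (set by -Xmx)
        System.out.println("max heap (MiB):   " + maxHeapMiB());
        // totalMemory(): heap currently reserved by the JVM (starts near -Xms)
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused part of the currently reserved heap
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Running it as, for example, java -Xms512m -Xmx2g HeapCheck should print a maximum close to 2048 MiB. The reported figure may be slightly below the -Xmx value, depending on the JVM and garbage collector.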

Given the memory size of current PCs, it should be easy to use a heap as large as 2 GB. Try the single-machine solution before considering a cluster.
