Mahout runs out of heap space


You haven't specified which process ran out of memory, which is important. You need to set MAHOUT_HEAPSIZE, not whatever JAVA_HEAP_MAX is.
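For example (a minimal sketch; the 4096 is my assumption, pick whatever fits your machine), the bin/mahout launcher reads MAHOUT_HEAPSIZE in megabytes and uses it to build the -Xmx flag:

    # Give the Mahout driver JVM 4 GB of heap (value is in MB);
    # bin/mahout turns this into -Xmx4096m.
    export MAHOUT_HEAPSIZE=4096
    bin/mahout <your-job-and-args>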

Did you modify the heap size for the Hadoop environment or for Mahout itself? See if this query on the Mahout mailing list helps. From personal experience, I can suggest reducing the size of the data you are trying to process. Whenever I tried to run the Bayes classifier on my laptop, the heap space would get exhausted after a few hours of running.
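If the failure is in a Hadoop task rather than in the Mahout driver itself, these are the two knobs I'd look at (a sketch only; the property name below is the classic-MapReduce one, so check the equivalent for your Hadoop version):

    # conf/hadoop-env.sh -- heap for the Hadoop daemons, in MB:
    export HADOOP_HEAPSIZE=2000

    # conf/mapred-site.xml -- heap for each map/reduce child JVM:
    # <property>
    #   <name>mapred.child.java.opts</name>
    #   <value>-Xmx1024m</value>
    # </property>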

I'd suggest running this on EC2 instead. I think basic S3/EC2 usage is covered by the free tier.

When you start a Mahout process, you can run "jps" to list all the Java processes running under your user id, along with their process ids. Once you've found your process, run "jmap -heap process-id" to see its heap space utilization.
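Something like this (the process name and ids below are illustrative; note that jmap -heap works on JDK 8 and earlier, while on JDK 9+ the equivalent is jhsdb jmap --heap --pid <pid>):

    $ jps
    4321 MahoutDriver   # illustrative name -- pick the pid of your job
    1234 Jps
    $ jmap -heap 4321
    # prints the configured heap limits (-Xmx etc.) and the current
    # usage of each generation, so you can see how close you are to the cap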

With this approach you can work out at which stage of your processing memory is exhausted, and which heap setting you need to increase.
