Question
Similar to a related question about R, I run into out-of-memory errors when running loops with grid search in H2O from Python. In R, calling gc() in each iteration helped. What is the proposed solution here?
Answer 1:
There appears to be no h2o.gc() function in the Python API. See "How can I debug memory issues?" in the FAQ. If you suspect the problem is the back end holding on to memory that it should have released, you could POST the back-end command (GarbageCollect) directly using the REST API. Studying the detailed logs might help confirm whether that is the case.
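As a minimal sketch of posting that command yourself, the snippet below uses only the Python standard library. The endpoint path /3/GarbageCollect and the default address http://localhost:54321 are assumptions about a typical local H2O cluster, and the helper names build_gc_request and trigger_gc are hypothetical; check them against your own cluster before relying on this.

```python
import urllib.request


def build_gc_request(base_url="http://localhost:54321"):
    """Build (but do not send) a POST request asking the H2O back end to garbage-collect.

    The path "/3/GarbageCollect" and the default base URL are assumptions
    about a typical local H2O cluster.
    """
    url = base_url.rstrip("/") + "/3/GarbageCollect"
    # An empty body is enough; method="POST" makes the verb explicit.
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_gc(base_url="http://localhost:54321", timeout=30):
    """Send the request to a running cluster and return the HTTP status code."""
    with urllib.request.urlopen(build_gc_request(base_url), timeout=timeout) as resp:
        return resp.status
```

Calling trigger_gc() at the end of each loop iteration would roughly mirror the gc() call that helped in R. If the Python client is already connected, h2o.api("POST /3/GarbageCollect") should issue the same request, assuming that endpoint is available in your H2O version.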
Wrapping up the advice from the comments:
- Use h2o.remove() on H2O frames and models you no longer need, at the end of the loop.
- Use h2o.remove_all() (h2o.removeAll() in R) if you do not need to keep anything around and your loop will be re-loading all the data it needs.
- Use H2OGridSearch rather than your own loops and your own grid code.
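If you do keep a hand-written loop, the first piece of advice could look roughly like the sketch below. The function run_manual_grid and its fit, score, and remove parameters are hypothetical names introduced for illustration; only h2o.remove() comes from the H2O Python API. The idea is to keep the scores on the Python side and delete each model from the cluster as soon as it has been scored.

```python
def run_manual_grid(param_sets, fit, score, remove=None):
    """Fit one model per parameter set, keeping only the scores.

    fit(params) should return a trained H2O model and score(model) a plain
    Python number. remove defaults to h2o.remove and is a parameter only so
    the cleanup step can be swapped out.
    """
    if remove is None:
        import h2o  # imported lazily; requires a running H2O cluster
        remove = h2o.remove

    results = []
    for params in param_sets:
        model = fit(params)
        try:
            results.append((params, score(model)))
        finally:
            # Delete the model on the back end even if scoring fails,
            # so memory does not pile up across iterations.
            remove(model)
    return results
```

Temporary frames created inside fit (for example, per-iteration splits) would need the same treatment with h2o.remove().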
I'd also add that cbind, rbind, and any function that modifies an H2O frame will make a copy of the entire frame. Rethinking how you do your data munging steps can sometimes reduce the memory requirements.
Source: https://stackoverflow.com/questions/45435739/python-h2o-memory-management