Python 3.6: Deal with MemoryError

Submitted by 独自空忆成欢 on 2019-12-11 08:13:58

Question


I have written a program for a machine learning task.

To train it, I need to load a large amount of data into the program's RAM (for the required 'fit' call).
In the actual run, the 'load_Data' function should return two 'ndarrays' (from the 'numpy' library) of roughly 12,000 × 110,000 float64 values each.

I get a MemoryError during the run.
I tested the program on a smaller dataset (a 2,000 × 110,000 array) and it works properly.

There are 2 solutions I have thought about:
1. Use a computer with more RAM (I currently have 8 GB).
2. Call the 'fit' method 10 times, each time on a different part of the dataset.

So, I want to ask:
Is solution #2 a good one?
Are there other solutions?

Thanks very much.


Answer 1:


Of course the first solution is perfectly fine, but rather expensive. But what are you going to do once you have a data set of many hundreds of gigabytes? It's prohibitive for most consumers to purchase that much RAM.

Indeed, batching (as you hinted at) is the most common way to train on really large data sets. Most machine learning toolkits allow you to provide your data in batches. As you have not hinted which one you use, I'll defer to e.g. the Keras documentation on how to set this up.
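As a rough illustration of what "providing your data in batches" means, here is a minimal pure-numpy sketch of the kind of infinite batch generator that Keras-style `fit` calls can consume. The function name `batch_generator` and the batch size are assumptions for the example, not part of any library API; the point is only that each `yield` hands the trainer a small slice instead of the whole array.

```python
import numpy as np

def batch_generator(X, y, batch_size):
    """Yield successive (X_batch, y_batch) slices of the data.

    Loops forever, as generator-based training loops typically
    expect; each batch is a view, so no extra copy of the full
    dataset is held in RAM.
    """
    n = X.shape[0]
    while True:
        for start in range(0, n, batch_size):
            end = min(start + batch_size, n)
            yield X[start:end], y[start:end]
```

With your numbers, a batch size of around 1,200 rows would split the 12,000-row dataset into the 10 parts you described.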

Edit: for scikit-learn, one can look at the documentation's list of estimators that support batching (incremental learning via 'partial_fit').
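To make the scikit-learn route concrete, here is a small sketch using `SGDClassifier.partial_fit`, one of the estimators that supports incremental learning. The chunk sizes and random data are placeholders standing in for your real `load_Data` output loaded piece by piece; note that `partial_fit` needs the full set of class labels up front on the first call.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # all labels must be declared on the first call

rng = np.random.default_rng(0)
for _ in range(10):  # train on 10 chunks instead of one giant array
    # Stand-in for loading one chunk of your real dataset from disk.
    X_chunk = rng.standard_normal((200, 5))
    y_chunk = rng.integers(0, 2, size=200)
    clf.partial_fit(X_chunk, y_chunk, classes=classes)
```

Each chunk is discarded after its `partial_fit` call, so peak memory is bounded by one chunk rather than the whole dataset.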



来源:https://stackoverflow.com/questions/51911416/python-3-6-deal-with-memoryerror
