I am building a large data dictionary from a set of text files. As I read in the lines and process them, I append(dataline) to a list.
At some point the list grows large enough that I get a MemoryError.
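A minimal sketch of the pattern being described, for concreteness; the sample files and the split-based "processing" step are illustrative assumptions, not from the original post:

```python
import os
import tempfile

# Create a couple of small stand-in text files (assumption: the real
# input is a set of much larger files on disk).
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(2):
    p = os.path.join(tmpdir, f"part{i}.txt")
    with open(p, "w") as f:
        f.write("key_a 1\nkey_b 2\n")
    paths.append(p)

# Read each file line by line, process each line, and append it to a list.
data = []
for path in paths:
    with open(path) as f:
        for line in f:
            dataline = line.split()   # stand-in for the real processing
            data.append(dataline)

print(len(data))
```

Each appended element keeps a reference alive, so with very large inputs the list itself can eventually exhaust memory.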
I had a similar problem happening when evaluating an expression containing large numpy arrays (actually, one was sparse). I was doing this on a machine with 64GB of memory, of which only about 8GB was in use, so I was surprised to get the MemoryError.
It turned out that my problem was array shape broadcasting: I had inadvertently duplicated a large dimension.
It went something like this:
- One array had shape (286577, 1) where I was expecting (286577).
- This was being combined with an array of shape (286577, 130).
- Since I thought the first array was (286577), I applied [:,newaxis] in the expression to bring it to (286577,1) so it would be broadcast to (286577,130).
- Because the shape was already (286577,1), however, [:,newaxis] produced shape (286577,1,1), and the two arrays were both broadcast to shape (286577,286577,130) ... of doubles. With two such arrays, that comes to about 80GB!
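The same trap can be reproduced at a harmless scale (shapes scaled down here to (5, 1) and (5, 3); the array contents are made up for illustration):

```python
import numpy as np

a = np.ones((5, 1))   # already 2-D: (5, 1), not the (5,) I assumed
b = np.ones((5, 3))   # the other operand: (5, 3)

# Intended fix for a (5,) array: add an axis so it broadcasts against (5, 3).
# Applied to an array that is already (5, 1), it yields (5, 1, 1) instead...
a_expanded = a[:, np.newaxis]
print(a_expanded.shape)   # (5, 1, 1)

# ...and broadcasting (5, 1, 1) against (5, 3) duplicates the large axis:
result = a_expanded * b
print(result.shape)       # (5, 5, 3)
```

At full size that duplicated leading axis is what blows the memory budget; checking `arr.shape` (or using `np.expand_dims` only after verifying the dimensionality) before adding axes avoids the surprise.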