Question
I have a CSV file of around 800MB which I'm trying to load into a dataframe via pandas, but I keep getting a memory error. I need to load it so I can join it to another, smaller dataframe.
Why am I getting a memory error even though I'm using a 64-bit version of Windows, 64-bit Python 3.4, and have over 8GB of RAM and plenty of hard disk space? Is this a bug in pandas? How can I solve this memory issue?
Answer 1:
Reading your CSV in chunks might help:

import pandas as pd

chunk_size = 10**5  # number of rows read per chunk
df = pd.concat(pd.read_csv(filename, chunksize=chunk_size),
               ignore_index=False)
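Since the end goal is a join, it may also help to avoid materializing the full 800MB dataframe at all and instead merge each chunk against the smaller dataframe as it is read, keeping only the rows you need. A minimal sketch, assuming the smaller dataframe is called small_df and the join key is a column named "id" (both names are hypothetical, adjust to your data):

import pandas as pd

chunk_size = 10**5
merged_parts = []
for chunk in pd.read_csv(filename, chunksize=chunk_size):
    # Join each chunk against the small in-memory dataframe;
    # an inner join drops non-matching rows, keeping memory use low.
    merged_parts.append(chunk.merge(small_df, on="id", how="inner"))

result = pd.concat(merged_parts, ignore_index=True)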
Source: https://stackoverflow.com/questions/37836275/memory-error-in-pandas