Dask running out of memory even with chunks
Question: I'm working with big CSV files and I need to compute a Cartesian product (merge operation). I've tried to tackle the problem with Pandas (you can check the Pandas code and a data format example for the same problem here) without success, due to memory errors. Now I'm trying with Dask, which is supposed to manage huge datasets even when their size is bigger than the available RAM.

First of all, I read both CSVs:

from dask import dataframe as dd

BLOCKSIZE = 64000000  # = 64 MB chunks

df1_file_path = '.
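For context, a minimal sketch of the pattern described above, assuming hypothetical file paths and a dummy join key (neither appears in the question, whose code is cut off): each CSV is read lazily in ~64 MB partitions, and the Cartesian product is expressed as an inner merge on a constant key so that every row of one frame pairs with every row of the other.

from dask import dataframe as dd

BLOCKSIZE = 64000000  # = 64 MB partitions when reading the CSVs

# Read both CSVs lazily, splitting each file into ~64 MB partitions.
df1 = dd.read_csv('df1.csv', blocksize=BLOCKSIZE)  # hypothetical path
df2 = dd.read_csv('df2.csv', blocksize=BLOCKSIZE)  # hypothetical path

# A Cartesian product can be written as a merge on a constant dummy key.
cartesian = (
    df1.assign(_key=1)
       .merge(df2.assign(_key=1), on='_key')
       .drop(columns='_key')
)

# Nothing is computed until the result is materialised, e.g. written to disk.
cartesian.to_csv('cartesian-*.csv', index=False)  # hypothetical output path

Note that even with lazy partitions, the product itself has len(df1) * len(df2) rows, so individual output partitions can still grow far beyond the 64 MB input blocks.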