Pandas to_csv() slow saving large dataframe

攒了一身酷 2020-12-17 10:35

I'm guessing this is an easy fix, but I'm running into an issue: it's taking nearly an hour to save a pandas DataFrame to a CSV file using to_csv()

3 answers
  •  感动是毒
    2020-12-17 11:02

    You said "[...] of mostly numeric (decimal) data". Do you have any columns with times and/or dates?

    I saved an 8 GB CSV in seconds when it had only numeric/string values, but it took 20 minutes to save a 500 MB CSV with two date columns. So I would recommend converting each date column to a string before saving it. The following command is enough:

    df['Column'] = df['Column'].astype(str) 
    

    I hope that this answer helps you.

    P.S.: I understand that saving as a .hdf file solved the problem. But, sometimes, we do need a .csv file anyway.
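
When a DataFrame has several date columns, the conversion suggested in this answer can be applied to all of them in one pass. The following is only a minimal sketch: the helper name `stringify_datetime_columns` and the sample column names are made up for illustration, and how much time it saves will depend on the pandas version and the data.

```python
import io

import pandas as pd


def stringify_datetime_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every datetime column converted to strings.

    Hypothetical helper, not part of pandas.
    """
    out = df.copy()
    # Find columns whose dtype is any flavor of datetime64 (naive or tz-aware).
    date_cols = [
        col for col in out.columns
        if pd.api.types.is_datetime64_any_dtype(out[col])
    ]
    for col in date_cols:
        # Same conversion as in the answer, applied per column.
        out[col] = out[col].astype(str)
    return out


# Small illustrative frame; "value" and "when" are assumed column names.
df = pd.DataFrame({
    "value": [1.5, 2.5],
    "when": pd.to_datetime(["2020-12-17 10:35:00", "2020-12-17 11:02:00"]),
})

converted = stringify_datetime_columns(df)

# Write to an in-memory buffer here; pass a file path to to_csv() in practice.
buffer = io.StringIO()
converted.to_csv(buffer, index=False)
```

The original DataFrame is left untouched, so the datetime columns stay available for further processing after the CSV has been written.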
