How to split csv file into respective csv files based on date in first column (python)?

你说的曾经没有我的故事 提交于 2021-02-07 22:54:11

问题


I have a large CSV with multiple years of electricity load data, and I would like to split it into multiple files on a month and year basis - i.e to return individual CSVs for Jan, Feb, Mar etc for 2013, 2014, 2015 etc.

I have reviewed a lot of the solutions in the forums, and have not had any luck. My current file is structured as follows;


01-Jan-11,1,34606,34677,35648,35685,31058,484,1730

01-Jan-11,2,35092,35142,36089,36142,31460,520,1730

01-Jan-11,3,34725,34761,36256,36234,31109,520,1730

01-Jan-11,4,33649,33698,35628,35675,30174,512,1730


The name of the column header in the date column is "SETTLEMENT DATE". I am familiar with using pandas and df's but cannot seem to wrap my head around how to do this one!

Thanks!


回答1:


Here is how you can do it.

First you read your data from your csv_file and save your cols like this:

df = pd.read_csv('path/input.csv')
cols = df.columns

then you add two columns to your dataframe which are Month and Year:

df['Year'] = df['SETTLEMENT DATE'].apply(lambda x: x.split('-')[-1])
df['Month'] = df['SETTLEMENT DATE'].apply(lambda x: x.split('-')[1])

All you have to do left is to write your dataframe in separate csv files either by month or year with each file carrying the name of the month or year as follows:

for i in set(df.Year): # for classified by years files
    filename = "path/"+i+".csv"
    df.loc[df.Year == i].to_csv(filename,index=False,columns=cols)

for i in set(df.Month): # for classified by months files
    filename = "path/"+i+".csv"
    df.loc[df.Month == i].to_csv(filename,index=False,columns=cols)

Hope this was helpful.



来源:https://stackoverflow.com/questions/45001847/how-to-split-csv-file-into-respective-csv-files-based-on-date-in-first-column-p

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!