Question
I have a dataframe with two columns and 5000 rows, like:
      A  B
0     1  4
1     5  5
2     3  2
3     9  7
...
5000  8  3
I want to split the dataframe every 100 rows, so that I get 50 slices. For training, what I want to do next is combine the 50 slices again into a new dataframe, an array, or anything else that I can write out to a csv file.
I used the following commands to split the dataframe into slices:
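import pandas as pd

df_original = pd.read_csv('/data.csv')
df = pd.DataFrame(df_original, columns=['A', 'B'])
for i in range(0, len(df['A']), 100):
    # df_100 holds only the current slice; it is overwritten on every iteration
    df_100 = df[i:i+100]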
After running the commands above, how can I combine the slices for the next step? Any advice would be helpful. Thank you so much.
Answer 1:
If you want to have 50 csv files:
for i in range(0, len(df['A']), 100):
    df_100 = df[i:i+100]
    df_100.to_csv("file" + str(i) + ".csv", index=False)
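If the per-slice files need to be combined again later, they can be read back and concatenated. The following is only a minimal sketch: the 300-row frame, the temporary folder, and the zero-padded file names (chosen so that sorting the names gives the slice order) are assumptions for illustration, not part of the original question.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in data: 300 rows, so three slices of 100.
df = pd.DataFrame({"A": range(300), "B": range(300)})

# Temporary folder, so the sketch leaves nothing in the working directory.
out_dir = Path(tempfile.mkdtemp())

for i in range(0, len(df), 100):
    # Zero-padded start index keeps alphabetical order equal to slice order.
    df.iloc[i:i + 100].to_csv(out_dir / f"file{i:04d}.csv", index=False)

# Read the files back in name order and stack them into one dataframe.
paths = sorted(out_dir.glob("file*.csv"))
df_back = pd.concat([pd.read_csv(p) for p in paths], ignore_index=True)
```

With unpadded names such as "file0.csv", "file100.csv", "file1000.csv", a plain alphabetical sort would not match the slice order, which is why the padding is used here.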
If you want to do some processing on those sliced dataframes, you can store them in a dictionary:
dict_of_df = {}
for i in range(0, len(df['A']), 100):
    dict_of_df["slice{}".format(i)] = df[i:i+100]
You can then access each sliced dataframe with dict_of_df[key], where key is "slice0", "slice100", "slice200", ...
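As a rough illustration of working with the dictionary of slices, the sketch below builds it for a small made-up frame and runs one processing step per slice. The 250-row data and the per-slice mean are assumptions chosen only to show the pattern, including a shorter final slice.

```python
import pandas as pd

# Hypothetical data: 250 rows, so the last slice has only 50 rows.
df = pd.DataFrame({"A": range(250), "B": range(250, 500)})

# Same keys as above: "slice0", "slice100", "slice200".
dict_of_df = {
    "slice{}".format(i): df.iloc[i:i + 100]
    for i in range(0, len(df), 100)
}

# Example processing step (illustrative only): mean of column A in each slice.
means = {key: part["A"].mean() for key, part in dict_of_df.items()}
```

Dictionaries keep insertion order, so iterating over dict_of_df visits the slices in the order they were cut.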
When you are done with the sliced dataframes and want to combine them again, use pd.concat (DataFrame.append was removed in pandas 2.0):
df_final = pd.concat(list(dict_of_df.values()))
If df_final is not in the original row order, sort it by index:
df_final = df_final.sort_index()
Then export it back to csv:
df_final.to_csv("filename.csv")
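Putting the steps together, here is a hedged end-to-end sketch: split into slices of 100, recombine, and export. The random data stands in for the contents of the csv file and is an assumption, as is collecting the slices in a list; to_csv is called without a path so that it returns the csv text instead of writing a file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the csv contents: 5000 rows, columns A and B.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(5000, 2)), columns=["A", "B"])

chunk_size = 100
# Keep every slice in a list instead of overwriting one variable in the loop.
slices = [df.iloc[i:i + chunk_size] for i in range(0, len(df), chunk_size)]

# Recombine all slices with a single concat call; the original index is kept.
df_final = pd.concat(slices)

# With no path argument, to_csv returns the csv content as a string.
csv_text = df_final.to_csv(index=False)
```

To write a file instead, pass a path, for example df_final.to_csv("filename.csv", index=False).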
Source: https://stackoverflow.com/questions/52456063/python-pandas-how-to-combine-slices-after-using-for