Write only first N rows from pandas df to CSV

*爱你&永不变心* submitted on 2021-02-08 18:43:51

Question


How can I write only the first N rows, or rows P to Q, of a pandas dataframe to CSV without subsetting the df first? I cannot subset the data I want to export because of memory issues.

I am thinking of a function which writes to csv row by row.

Thank you


Answer 1:


  • Use head: it returns the first n rows.

Ex.

import pandas as pd
import numpy as np
date = pd.date_range('20190101',periods=6)
df = pd.DataFrame(np.random.randn(6,4), index=date, columns=list('ABCD'))

# write only the top two rows to a CSV file
# (to_csv returns None when given a path, so there is nothing to print)
df.head(2).to_csv("test.csv")



Answer 2:


Does this work for you?

df.iloc[:N, :].to_csv()

Or

df.iloc[P:Q, :].to_csv()

I believe slicing with df.iloc generally produces a reference to (a view of) the original dataframe rather than a copy of the data.

If this still doesn't work, you might also try setting the chunksize in the to_csv call. It may be that pandas is able to create the subset without using much more memory, but then it makes a complete copy of the rows written to each chunk. If the chunksize is the whole frame, you would end up copying the whole frame at that point and running out of memory.
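A minimal sketch of the chunksize suggestion. The demo frame, the values of P and Q, and the output path are assumptions made for illustration; `chunksize` is the `to_csv` parameter that sets how many rows are written at a time.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Small demo frame; in practice df is the large frame already in memory.
df = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list("ABCD"))

P, Q = 2, 7  # hypothetical bounds: positions P..Q-1 (iloc excludes the stop)
out_path = os.path.join(tempfile.mkdtemp(), "rows_p_to_q.csv")  # hypothetical path

# Slice by position, then let to_csv write two rows at a time.
df.iloc[P:Q, :].to_csv(out_path, chunksize=2)

# Read the file back to check what was written.
written = pd.read_csv(out_path, index_col=0)
print(written)
```

Whether a smaller chunksize actually lowers peak memory depends on the frame and the pandas version, so it is worth measuring on the real data.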

If all else fails, you can loop through df.iterrows(), df.iloc[P:Q, :].iterrows(), or df.iloc[P:Q, :].itertuples() and write each row using the csv module (possibly with writer.writerows(df.iloc[P:Q, :].itertuples())).
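The row-by-row fallback could look roughly like this. It is a sketch, not a tuned implementation: the helper name `write_rows`, the demo frame, and the output path are all hypothetical. It uses `itertuples(name=None)` so that each row arrives as a plain tuple whose first element is the index.

```python
import csv
import os
import tempfile

import numpy as np
import pandas as pd

# Small demo frame standing in for the large one.
df = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list("ABCD"))


def write_rows(frame, path, start, stop):
    """Write rows start..stop-1 (by position) of frame to path, one row at a time."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header: an empty cell for the index column, then the column names.
        writer.writerow([""] + list(frame.columns))
        # itertuples(name=None) yields plain tuples: (index, col1, col2, ...).
        writer.writerows(frame.iloc[start:stop, :].itertuples(name=None))


out_path = os.path.join(tempfile.mkdtemp(), "rows_by_hand.csv")  # hypothetical path
write_rows(df, out_path, 2, 7)

with open(out_path) as f:
    lines = f.read().splitlines()
print(lines)
```

This is typically much slower than to_csv, so it makes sense only when the built-in writer really does run out of memory.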




Answer 3:


Maybe you can select the index labels of the rows that you want to write to your CSV file, like this:

df[df.index.isin([1, 2, ...])].to_csv('file.csv')

Or use this one:

df.loc[n:n].to_csv('file.csv')
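One point to keep in mind with the label-based forms: a df.loc slice includes both end labels, whereas a df.iloc slice excludes the stop position. A small sketch, assuming a demo frame with the default integer index (the variable names are illustrative only):

```python
import pandas as pd

# Demo frame with the default RangeIndex 0..9.
df = pd.DataFrame({"A": range(10)})

by_label = df.loc[2:4]                 # labels 2, 3 and 4 (end label included)
by_position = df.iloc[2:4]             # positions 2 and 3 (stop excluded)
single_row = df.loc[3:3]               # only the row labelled 3
picked = df[df.index.isin([1, 2, 5])]  # an arbitrary set of labels

print(len(by_label), len(by_position), len(single_row), len(picked))
```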


Source: https://stackoverflow.com/questions/57458771/write-only-first-n-rows-from-pandas-df-to-csv
