How to index a pandas data frame starting at n?

Question

Is it possible to start the index from n in a pandas dataframe?

I have some datasets saved as CSV files, and I would like to add an index column holding the row number, continuing from where the row numbers ended in the previous file.

For example, for the first file I'm using the following code, which works fine: the output CSV file has rows numbered from 1 to 1048574, as expected:

yellow_jan['index'] = range(1, len(yellow_jan) + 1)

I would like to do the same for the yellow_feb file, but starting the row index at 1048575, and so on for later files.

Appreciate any help!

Answer 1:

You can either reset the index at the end, or keep a local variable holding the next row number and pass it to NumPy's `arange` function. After each file you read, increase that variable by the number of rows in the file.
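The running-offset idea can be sketched roughly as follows. The two small frames and the `fare` column are hypothetical stand-ins for the monthly CSV files (in practice they would come from `pd.read_csv`), and `next_row` is just an illustrative name for the local variable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the monthly files (assumption, not the real data).
frames = [
    pd.DataFrame({"fare": [5.0, 7.5, 3.2]}),
    pd.DataFrame({"fare": [9.1, 4.4]}),
]

next_row = 1  # row number to assign to the first row of the first file
numbered = []
for frame in frames:
    frame = frame.copy()
    # Number the rows from next_row to next_row + len(frame) - 1.
    frame["index"] = np.arange(next_row, next_row + len(frame))
    next_row += len(frame)  # the next file continues where this one ended
    numbered.append(frame)
```

Each frame in `numbered` can then be written out with `to_csv` as before; only `next_row` needs to survive between files.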

Answer 2:

If you plan to concatenate the DataFrames, you can simply pass ignore_index=True and let pandas renumber the rows:

import pandas as pd
import numpy as np
df1 = pd.DataFrame({"a": np.arange(10)})
df2 = pd.DataFrame({"a": np.arange(10,20)})
df = pd.concat([df1, df2],ignore_index=True)

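Otherwise, shift the second frame's index by the length of the first:

# df2's index now continues where df1's index ended
df2.index += len(df1)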
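If the numbering should start at 1 as in the question, one option is to assign a `pd.RangeIndex` explicitly instead of shifting the default index. This is only a sketch: `jan` and `feb` are hypothetical small frames standing in for the real files.

```python
import pandas as pd

# Hypothetical small frames standing in for the January and February files.
jan = pd.DataFrame({"a": range(3)})
feb = pd.DataFrame({"a": range(3, 5)})

# Number the first frame starting at 1.
jan.index = pd.RangeIndex(start=1, stop=len(jan) + 1)

# Start the second frame right after the last label of the first.
start = jan.index[-1] + 1
feb.index = pd.RangeIndex(start=start, stop=start + len(feb))
```

The same pattern extends to later files by reading the last label of whichever frame came before.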

Answer 3:

# the range must have exactly as many values as df has rows (10 here)
df["new_index"] = range(10, 20)
df = df.set_index("new_index")
df

Source: https://stackoverflow.com/questions/47515644/how-to-index-a-pandas-data-frame-starting-at-n

Tags

python

pandas

csv

dataframe

indexing
