ValueError: cannot insert ID, already exists

Submitted by 北城余情 on 2021-02-05 20:22:37

Question


I have this data:

ID   TIME
1    2
1    4
1    2
2    3

I want to group the data by ID and calculate the mean time and the size of each group.

ID   MEAN_TIME COUNT
1    2.67      3
2    3.00      1

If I run this code, then I get an error "ValueError: cannot insert ID, already exists":

result = df.groupby(['ID']).agg({'TIME': 'mean', 'ID': 'count'}).reset_index()

Answer 1:


Pass drop=True so that reset_index discards the index instead of inserting it as a new column. Note that the group keys are lost this way, and the ID column holds the counts:

result = df.groupby(['ID']).agg({'TIME': 'mean', 'ID': 'count'}).reset_index(drop=True)
print (result)
   ID      TIME
0   3  2.666667
1   1  3.000000
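
To see where the error comes from, here is a minimal sketch that builds the clashing frame by hand. The frame `aggregated` is a hypothetical stand-in for the groupby result: its index is named ID and it also has a column named ID, so reset_index has nowhere to put the index.

```python
import pandas as pd

# Hypothetical stand-in for the aggregated frame: the index is named "ID"
# and a column named "ID" (holding the counts) exists as well.
aggregated = pd.DataFrame(
    {"ID": [3, 1], "TIME": [8 / 3, 3.0]},
    index=pd.Index([1, 2], name="ID"),
)

try:
    aggregated.reset_index()  # tries to insert the index as a column called "ID"
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# drop=True discards the index instead of inserting it, so nothing clashes,
# but the group keys (1 and 2) no longer appear in the result.
dropped = aggregated.reset_index(drop=True)
print(dropped)
```

This is why drop=True silences the error but is rarely what you want when the group key has to stay in the output.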

If you need the index as a new column, rename the aggregated columns first so that the names no longer clash:

result = (df.groupby(['ID']).agg({'TIME': 'mean', 'ID': 'count'})
            .rename(columns={'ID':'COUNT','TIME':'MEAN_TIME'})
            .reset_index())
print (result)
   ID  COUNT  MEAN_TIME
0   1      3   2.666667
1   2      1   3.000000

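Alternatively, rename while aggregating by passing nested dicts, then drop the top level of the resulting column MultiIndex (the nested-dict form was deprecated in pandas 0.20 and removed in 1.0, so this runs on older versions only):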

result = df.groupby(['ID']).agg({'TIME':{'MEAN_TIME': 'mean'}, 'ID': {'COUNT': 'count'}})
result.columns = result.columns.droplevel(0)
print (result.reset_index())
   ID  COUNT  MEAN_TIME
0   1      3   2.666667
1   2      1   3.000000
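
For newer pandas releases, the same droplevel idea can be sketched with a list of functions instead of nested dicts. This is an illustration rather than the answerer's code: it aggregates only TIME, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"ID": [1, 1, 1, 2], "TIME": [2, 4, 2, 3]})

# A list of functions yields two-level columns:
# ("TIME", "mean") and ("TIME", "count").
result = df.groupby("ID").agg({"TIME": ["mean", "count"]})

# Drop the top level ("TIME"), then give the remaining names the wanted labels.
result.columns = result.columns.droplevel(0)
result = result.rename(columns={"mean": "MEAN_TIME", "count": "COUNT"}).reset_index()
print(result)
```

Because no output column is called ID here, reset_index can insert the group key without a clash.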



Answer 2:


I'd limit my groupby to just the TIME column.

df.groupby(['ID']).TIME.agg({'MEAN_TIME': 'mean', 'COUNT': 'count'}).reset_index()

   ID  MEAN_TIME  COUNT
0   1   2.666667      3
1   2   3.000000      1
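
Passing a renaming dict to agg on a single column stopped working in pandas 1.0. A minimal sketch of the same approach using keyword-based named aggregation, which I believe is available from pandas 0.25 onward (the sample frame is rebuilt from the question's data):

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"ID": [1, 1, 1, 2], "TIME": [2, 4, 2, 3]})

# Each keyword names an output column; its value is the function applied to TIME.
result = (
    df.groupby("ID")["TIME"]
    .agg(MEAN_TIME="mean", COUNT="count")
    .reset_index()
)
print(result)
```

Since only TIME is aggregated, the output never contains a second column named ID, so reset_index works without renaming anything.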


Source: https://stackoverflow.com/questions/41576242/valueerror-cannot-insert-id-already-exists
