How to append two or more DataFrames in pandas and do some analysis

Submitted by 試著忘記壹切 on 2019-12-20 03:33:15

Question


I have three DataFrames:

import pandas as pd

df1 = pd.DataFrame({"Name": ["one", "two", "three"], "value": [4, 5, 6]})
df2 = pd.DataFrame({"Name": ["four", "one", "three"], "value": [8, 6, 2]})
df3 = pd.DataFrame({"Name": ["one", "four", "six"], "value": [1, 1, 1]})

I can append them one by one, but I want to append all three DataFrames at once and then do some analysis.

For each name, I want the sum of its values and the fraction of DataFrames that contain it, that is, (number of DataFrames in which the name appears) / (total number of DataFrames).

My desired output is:

 Name  value   Count
 one    11      1
 two    5       0.333
 three  8       0.666
 four   9       0.666
 six    1       0.333

Please help, thanks in advance!


Answer 1:


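Use:

  • first, concat to stack all the DataFrames into one
  • then groupby with agg to get the sum and the row count per Name
  • finally, divide the Count column by the number of DataFrames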

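dfs = [df1, df2, df3]
# stack all DataFrames into one
df = pd.concat(dfs)

# per Name: sum of value, and number of rows (note: this reassigns df1)
df1 = df.groupby('Name')['value'].agg([('value', 'sum'), ('Count', 'size')]).reset_index()
# turn the row count into a fraction of the number of DataFrames
df1['Count'] /= len(dfs)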
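
On pandas 0.25 or later, the same aggregation can also be written with named aggregation (keyword arguments of the form output_column=function) instead of a list of tuples. This is a minimal, self-contained sketch of that variant, not a replacement for the code above. The name result is my own choice here, used so that the input df1 is not overwritten.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["one", "two", "three"], "value": [4, 5, 6]})
df2 = pd.DataFrame({"Name": ["four", "one", "three"], "value": [8, 6, 2]})
df3 = pd.DataFrame({"Name": ["one", "four", "six"], "value": [1, 1, 1]})
dfs = [df1, df2, df3]

# `result` is a hypothetical name, chosen so that df1 stays intact.
result = (
    pd.concat(dfs)
    .groupby("Name")["value"]
    .agg(value="sum", Count="size")  # named aggregation: output_column=function
    .reset_index()
)
# Convert the row count into a fraction of the number of DataFrames.
result["Count"] = result["Count"] / len(dfs)
print(result)
```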

Similar solution:

df1 = (pd.concat(dfs)
         .groupby('Name')['value']
         .agg([('value', 'sum'), ('Count', 'size')])
         .assign(Count=lambda x: x['Count'] / len(dfs))
         .reset_index())

print(df1)
    Name  value     Count
0   four      9  0.666667
1    one     11  1.000000
2    six      1  0.333333
3  three      8  0.666667
4    two      5  0.333333
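
One caveat: size counts rows, so Count matches "fraction of DataFrames containing the name" only if each name appears at most once per DataFrame, as it does in the question's data. If duplicates within a single DataFrame are possible, one option is to tag each row with its source and count distinct sources instead. The sketch below assumes made-up input data, and source and result are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Hypothetical input in which "one" appears twice in the first DataFrame.
dfs = [
    pd.DataFrame({"Name": ["one", "one", "two"], "value": [4, 1, 5]}),
    pd.DataFrame({"Name": ["one", "three"], "value": [6, 2]}),
]

# Tag every row with the position of the DataFrame it came from.
# `source` is a hypothetical helper column, not part of the original data.
tagged = pd.concat([d.assign(source=i) for i, d in enumerate(dfs)])

result = (
    tagged.groupby("Name")
    .agg(value=("value", "sum"), Count=("source", "nunique"))
    .reset_index()
)
# Fraction of DataFrames in which each name appears at least once.
result["Count"] = result["Count"] / len(dfs)
print(result)
```

With this input, size would give "one" a Count of 3 / 2 = 1.5, whereas counting distinct sources gives 2 / 2 = 1.0.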


Source: https://stackoverflow.com/questions/49067073/how-to-append-two-or-more-dataframes-in-pandas-and-do-some-analysis
