Question
I have three DataFrames:
import pandas as pd

df1 = pd.DataFrame({"Name": ["one", "two", "three"], "value": [4, 5, 6]})
df2 = pd.DataFrame({"Name": ["four", "one", "three"], "value": [8, 6, 2]})
df3 = pd.DataFrame({"Name": ["one", "four", "six"], "value": [1, 1, 1]})
I can append them one by one, but I want to combine all three DataFrames at once and then do some analysis.
For each name, I want the sum of its values and the fraction of DataFrames that contain it: (number of DataFrames containing the name) / (total number of DataFrames).
My desired output is:
Name   value  Count
one    11     1
two    5      0.333
three  8      0.666
four   9      0.666
six    1      0.333
Please help, thanks in advance!
Answer 1:
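Use:
- first concat to stack the DataFrames into one
- then group by Name and aggregate with agg (sum for value, size for Count)
- finally divide the Count column by the number of DataFrames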
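dfs = [df1, df2, df3]
# stack all frames into a single DataFrame
df = pd.concat(dfs)
# per Name: sum the values and count the rows
df1 = df.groupby('Name')['value'].agg([('value', 'sum'), ('Count', 'size')]).reset_index()
# turn the row count into a fraction of the number of frames
df1['Count'] /= len(dfs)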
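The three steps above can also be written with keyword-style named aggregation, which should be available in pandas 0.25 and later. This is only a sketch on the question's data: the name `result` is introduced here, and `sort=False` is an optional choice that keeps names in order of first appearance, which happens to match the row order of the desired output.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["one", "two", "three"], "value": [4, 5, 6]})
df2 = pd.DataFrame({"Name": ["four", "one", "three"], "value": [8, 6, 2]})
df3 = pd.DataFrame({"Name": ["one", "four", "six"], "value": [1, 1, 1]})
dfs = [df1, df2, df3]

result = (
    pd.concat(dfs, ignore_index=True)
    .groupby("Name", sort=False)["value"]   # keep first-appearance order
    .agg(value="sum", Count="size")         # named aggregation: output name = function
    .reset_index()
)
# convert the row count into a fraction of the number of frames
result["Count"] = result["Count"] / len(dfs)
print(result)
```

Writing to `result` instead of reusing `df1` leaves the original input frame untouched.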
A similar solution, written as a single method chain:
df1 = (pd.concat(dfs)
         .groupby('Name')['value']
         .agg([('value', 'sum'), ('Count', 'size')])
         .assign(Count=lambda x: x.Count / len(dfs))
         .reset_index())
print(df1)
    Name  value     Count
0   four      9  0.666667
1    one     11  1.000000
2    six      1  0.333333
3  three      8  0.666667
4    two      5  0.333333
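One caveat: 'size' counts rows, so a name that appears twice within the same DataFrame would be counted twice. If that can happen in your data, a possible variant is to tag each row with its source frame and count distinct sources instead. The sketch below assumes the question's data; the level names "source" and "row" and the variable names `combined` and `robust` are introduced here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["one", "two", "three"], "value": [4, 5, 6]})
df2 = pd.DataFrame({"Name": ["four", "one", "three"], "value": [8, 6, 2]})
df3 = pd.DataFrame({"Name": ["one", "four", "six"], "value": [1, 1, 1]})
dfs = [df1, df2, df3]

# keys= labels every row with the position of the frame it came from
combined = (
    pd.concat(dfs, keys=list(range(len(dfs))), names=["source", "row"])
    .reset_index(level="source")            # move the label into a column
)

robust = (
    combined.groupby("Name", sort=False)
    .agg(value=("value", "sum"),            # total value per name
         Count=("source", "nunique"))       # number of distinct frames per name
    .reset_index()
)
robust["Count"] = robust["Count"] / len(dfs)
print(robust)
```

On the question's data no name repeats within a frame, so this should give the same numbers as the answer above.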
Source: https://stackoverflow.com/questions/49067073/how-to-append-two-or-more-dataframes-in-pandas-and-do-some-analysis