Question
I have the following pandas DataFrame:
Type Actual Predicted
A 4 3
A 10 18
A 13 11
B 3 10
B 4 2
B 8 33
C 20 17
C 40 33
C 87 80
C 32 30
I have code to calculate R^2 and RMSE, but I don't know how to calculate them for each distinct "Type".
For now, my approach is to split the larger table into three smaller tables containing only the A, B, or C rows, calculate R^2 and RMSE on each smaller table, and then append the results back together.
This is inefficient, and I believe there should be an easier way.
Below is the format I want for the grouped results:
Type R^2 RMSE
A value value
B value value
C value value
Answer 1:
Here is a method using groupby:
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_squared_error

def r2_rmse(g):
    # g is the sub-DataFrame holding the rows of one Type
    r2 = r2_score(g['Actual'], g['Predicted'])
    rmse = np.sqrt(mean_squared_error(g['Actual'], g['Predicted']))
    return pd.Series(dict(r2=r2, rmse=rmse))

# One row per Type, with columns Type, r2 and rmse
your_df.groupby('Type').apply(r2_rmse).reset_index()
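To try the same idea end to end, here is a minimal self-contained sketch that rebuilds the table from the question and computes both metrics with NumPy only, so scikit-learn is not required. The names df and result are illustrative and not from the original answer. R^2 is computed as 1 - SS_res / SS_tot, which assumes each group has at least two distinct Actual values; otherwise SS_tot is zero and the ratio is undefined.

```python
import numpy as np
import pandas as pd

# Data copied from the question; `df` is an illustrative name.
df = pd.DataFrame({
    "Type": list("AAABBBCCCC"),
    "Actual": [4, 10, 13, 3, 4, 8, 20, 40, 87, 32],
    "Predicted": [3, 18, 11, 10, 2, 33, 17, 33, 80, 30],
})

def r2_rmse(g):
    # g holds the Actual and Predicted columns for one Type.
    actual = g["Actual"].to_numpy(dtype=float)
    predicted = g["Predicted"].to_numpy(dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)       # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)   # total sum of squares
    return pd.Series({
        "R^2": 1.0 - ss_res / ss_tot,                # assumes ss_tot != 0
        "RMSE": np.sqrt(ss_res / len(actual)),
    })

# Selecting the two value columns first keeps the grouping column out of apply.
result = (
    df.groupby("Type")[["Actual", "Predicted"]]
      .apply(r2_rmse)
      .reset_index()
)
print(result)
```

The result has one row per Type with the columns Type, R^2 and RMSE, matching the layout requested in the question. A negative R^2 for a group means its predictions fit worse than simply predicting that group's mean.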
Source: https://stackoverflow.com/questions/47914428/python-dataframe-calculating-r2-and-rmse-using-groupby-on-one-column