Confidence Interval in Python dataframe

刺人心 2021-01-06 00:11

I am trying to calculate the mean and the 95% confidence interval of a column "Force" in a large dataset. I need the result by using the groupby function by grouping differen

2 Answers
  •  陌清茗 (original poster)
    2021-01-06 01:03

    import numpy as np
    import pandas as pd

    # Sample data: force measurements grouped by class
    df = pd.DataFrame({'Class': ['A1', 'A1', 'A1', 'A2', 'A3', 'A3'],
                       'Force': [50, 150, 100, 120, 140, 160]},
                      columns=['Class', 'Force'])
    print(df)
    print('-'*30)

    # Per-class mean, sample size, and sample standard deviation
    stats = df.groupby(['Class'])['Force'].agg(['mean', 'count', 'std'])
    print(stats)
    print('-'*30)

    # 95% interval via the normal approximation: mean +/- 1.96 * std / sqrt(n).
    # A class with a single row has std = NaN, so its bounds are NaN as well.
    margin = 1.96 * stats['std'] / np.sqrt(stats['count'])
    stats['ci95_hi'] = stats['mean'] + margin
    stats['ci95_lo'] = stats['mean'] - margin
    print(stats)
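
As a worked check, the code above computes the normal-approximation interval for each class. The symbols $\bar{x}$, $s$, and $n$ are introduced here only for illustration and correspond to the `mean`, `std`, and `count` columns:

$$
\bar{x} \pm 1.96\,\frac{s}{\sqrt{n}}
$$

For class A1 this gives $100 \pm 1.96 \cdot 50/\sqrt{3} \approx 100 \pm 56.58$, i.e. roughly $[43.42,\ 156.58]$.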
    

    The output is:

      Class  Force
    0    A1     50
    1    A1    150
    2    A1    100
    3    A2    120
    4    A3    140
    5    A3    160
    ------------------------------
           mean  count        std
    Class                        
    A1      100      3  50.000000
    A2      120      1        NaN
    A3      150      2  14.142136
    ------------------------------
           mean  count        std     ci95_hi     ci95_lo
    Class                                                
    A1      100      3  50.000000  156.580326   43.419674
    A2      120      1        NaN         NaN         NaN
    A3      150      2  14.142136  169.600000  130.400000
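
The factor 1.96 used above comes from the normal distribution, which tends to understate the uncertainty when a group has only a few rows. A possible variant, sketched here under the assumption that SciPy is available and that the data within each class are roughly normal, takes the critical value from Student's t distribution instead. The names `summary`, `sem`, `dof`, and `t_crit` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd
from scipy import stats as st

df = pd.DataFrame({'Class': ['A1', 'A1', 'A1', 'A2', 'A3', 'A3'],
                   'Force': [50, 150, 100, 120, 140, 160]})

summary = df.groupby('Class')['Force'].agg(['mean', 'count', 'std'])

# Standard error of the mean for each class
sem = summary['std'] / np.sqrt(summary['count'])

# Two-sided 95% critical value from Student's t with (n - 1) degrees of freedom.
# Classes with a single row have 0 degrees of freedom; mask them to NaN.
dof = summary['count'] - 1
t_crit = st.t.ppf(0.975, dof.where(dof > 0).to_numpy())

summary['ci95_lo'] = summary['mean'] - t_crit * sem
summary['ci95_hi'] = summary['mean'] + t_crit * sem
print(summary)
```

With groups this small, the t-based intervals come out considerably wider than the 1.96-based ones. For large groups the two approaches give nearly the same bounds.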
    
