Correct way to obtain confidence interval with scipy

长发绾君心 2020-12-12 17:08

I have a 1-dimensional array of data:

a = np.array([1,2,3,4,4,4,5,5,5,5,4,4,4,6,7,8])

for which I want to obtain the 68% confidence interval.

3 answers
  •  北荒 (OP)
    2020-12-12 17:47

    I just checked how R and GraphPad calculate confidence intervals: both widen the interval when the sample size (n) is small, e.g., by more than 6-fold for n=2 compared to a large n (at the 95% level). This code (based on shasan's answer) matches their confidence intervals:

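    import numpy as np
    import scipy.stats as st

    # Returns the confidence interval of the mean, based on Student's t distribution.
    def confIntMean(a, conf=0.95):
        mean, sem = np.mean(a), st.sem(a)
        # Two-sided critical t value with len(a) - 1 degrees of freedom
        m = st.t.ppf((1 + conf) / 2., len(a) - 1)
        return mean - m*sem, mean + m*sem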
    

    For R, I checked against t.test(a). GraphPad's confidence interval of a mean page has "user level" info on the sample size dependency.
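
    To make the sample size dependency concrete, here is a minimal sketch that compares the critical value of the t distribution with that of the normal distribution. The helper name tToNormalRatio is made up for this illustration and is not part of scipy; it simply divides one ppf value by the other, so it shows how much wider the t-based interval is for a given n.

```python
import scipy.stats as st

# Hypothetical helper (not a scipy function): ratio of the t-based to the
# normal-based interval half-width for a sample of size n.
def tToNormalRatio(n, conf=0.95):
    t_crit = st.t.ppf((1 + conf) / 2., n - 1)   # small-sample critical value
    z_crit = st.norm.ppf((1 + conf) / 2.)       # large-sample (normal) critical value
    return t_crit / z_crit

# Print the widening factor for a few sample sizes.
for n in (2, 5, 16, 100):
    print(n, tToNormalRatio(n))
```

    At the 95% level the ratio is above 6 for n=2 and approaches 1 as n grows, which is why the two approaches nearly agree for larger samples.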

    Here is the output for Gabriel's example:

    In [2]: a = np.array([1,2,3,4,4,4,5,5,5,5,4,4,4,6,7,8])
    
    In [3]: confIntMean(a, 0.68)
    Out[3]: (3.9974214366806184, 4.877578563319382)
    
    In [4]: st.norm.interval(0.68, loc=np.mean(a), scale=st.sem(a))
    Out[4]: (4.0120010966037407, 4.8629989033962593)
    

    Note that the difference between the confIntMean() and st.norm.interval() intervals is relatively small here; len(a) == 16 is not too small.
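
    As a possible shortcut, scipy's st.t.interval() should give the same t-based interval without a custom function. This is only a sketch: the confidence level is passed positionally here on the assumption that its keyword name differs between scipy releases.

```python
import numpy as np
import scipy.stats as st

a = np.array([1,2,3,4,4,4,5,5,5,5,4,4,4,6,7,8])

# t-based interval of the mean with len(a) - 1 degrees of freedom,
# centred on the sample mean and scaled by the standard error.
lo, hi = st.t.interval(0.68, len(a) - 1, loc=np.mean(a), scale=st.sem(a))
print(lo, hi)
```

    This mirrors the st.norm.interval() call above, with the normal distribution swapped for Student's t.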
