p-value from fisher.test() does not match phyper()

Submitted by 感情迁移 on 2021-02-19 04:42:16

Question


Fisher's Exact Test is related to the hypergeometric distribution, so I would expect these two commands to return identical p-values. Can anyone explain what I'm doing wrong, given that they do not match?

#data (variable names chosen to match dhyper() argument names)
x = 14
m = 20
n = 41047
k = 40

#Fisher test, alternative = 'greater'
(fisher.test(matrix(c(x, m-x, k-x, n-(k-x)),2,2), alternative='greater'))$p.value 
#returns 2.01804e-39

#hypergeometric distribution, lower.tail = F, i.e. P[X > x]
phyper(x, m, n, k, lower.tail = F, log.p = F)
#returns 5.115862e-43

Answer 1:


In this case, the actual call to phyper that is relevant is phyper(x - 1, m, n, k, lower.tail = FALSE). Look at the source code for fisher.test relevant to your call of fisher.test(matrix(c(x, m-x, k-x, n-(k-x)),2,2), alternative='greater'). At line 138, PVAL is set to:

switch(alternative, less = pnhyper(x, or), 
    greater = pnhyper(x, or, upper.tail = TRUE), 
    two.sided = {
      if (or == 0) as.numeric(x == lo) else if (or == 
        Inf) as.numeric(x == hi) else {
        relErr <- 1 + 10^(-7)
        d <- dnhyper(or)
        sum(d[d <= d[x - lo + 1] * relErr])
      }
    })

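Since alternative = 'greater', PVAL is set to pnhyper(x, or, upper.tail = TRUE). You can see pnhyper defined on line 122. Here, or = 1, which is passed to ncp, so the call is phyper(x - 1, m, n, k, lower.tail = FALSE). The x - 1 matters because phyper(q, ..., lower.tail = FALSE) returns P[X > q]: the one-sided test needs P[X >= x], which equals P[X > x - 1], whereas your call computed P[X > x] and so left out the probability of the observed count itself.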
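To make the branch taken for or = 1 concrete, here is a minimal sketch of it as a standalone function. The name pnhyper_central is hypothetical and this is a paraphrase of that one branch, not the actual fisher.test source; it also takes m, n and k as explicit arguments, which is an assumption made so the snippet runs on its own.

```r
# Hypothetical helper (not part of R): covers only the central case (ncp == 1)
# described above, with m, n, k passed in explicitly.
pnhyper_central <- function(q, m, n, k, upper.tail = FALSE) {
  if (upper.tail) {
    # Upper tail that includes q: P[X >= q] = P[X > q - 1]
    phyper(q - 1, m, n, k, lower.tail = FALSE)
  } else {
    # Lower tail that includes q: P[X <= q]
    phyper(q, m, n, k)
  }
}

x <- 14; m <- 20; n <- 41047; k <- 40

p_upper  <- pnhyper_central(x, m, n, k, upper.tail = TRUE)  # includes X == x
p_strict <- phyper(x, m, n, k, lower.tail = FALSE)          # excludes X == x
print(c(p_upper = p_upper, p_strict = p_strict))
```

The two printed values should correspond to the fisher.test() and phyper() results quoted in the question.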

With your values:

x = 14
m = 20
n = 41047
k = 40
phyper(x - 1, m, n, k, lower.tail = FALSE)
# [1] 2.01804e-39
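As a cross-check, the same upper tail can be written as an explicit sum of point probabilities and compared with both phyper() and fisher.test(). This is only a sketch using the values from the question; the variable names p_sum, p_phyper, p_fisher and the helper rel_diff are illustrative.

```r
x <- 14; m <- 20; n <- 41047; k <- 40

# P[X >= x] as an explicit sum over every count at least as large as x.
# X cannot exceed min(m, k), so the sum stops there.
p_sum <- sum(dhyper(x:min(m, k), m, n, k))

p_phyper <- phyper(x - 1, m, n, k, lower.tail = FALSE)
p_fisher <- fisher.test(matrix(c(x, m - x, k - x, n - (k - x)), 2, 2),
                        alternative = "greater")$p.value

# Compare on a relative scale: with values near 1e-39, an absolute
# difference would look like zero no matter how far apart they are.
rel_diff <- function(a, b) abs(a - b) / abs(b)
print(c(sum_vs_phyper    = rel_diff(p_sum, p_phyper),
        fisher_vs_phyper = rel_diff(p_fisher, p_phyper)))
```

Both relative differences should be at the level of floating-point rounding.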


Source: https://stackoverflow.com/questions/53051977/p-value-from-fisher-test-does-not-match-phyper
