Post-Hoc tests for chi-sq in R

问题

I have a table that looks like this.

> dput(theft_loc)
structure(c(13704L, 14059L, 14263L, 14450L, 14057L, 15503L, 14230L, 
16758L, 15289L, 15499L, 16066L, 15905L, 18531L, 19217L, 12410L, 
13398L, 13308L, 13455L, 13083L, 14111L, 13068L, 19569L, 18771L, 
19626L, 20290L, 19816L, 20923L, 20466L, 20517L, 19377L, 20035L, 
20504L, 20393L, 22409L, 22289L, 7997L, 8106L, 7971L, 8437L, 8246L, 
9090L, 8363L, 7934L, 7874L, 7909L, 8150L, 8191L, 8746L, 8277L, 
27194L, 25220L, 26034L, 27080L, 27334L, 30819L, 30633L, 10452L, 
10848L, 11301L, 11494L, 11265L, 11985L, 11038L, 12104L, 13368L, 
14594L, 14702L, 13891L, 12891L, 12939L), .Dim = c(7L, 10L), .Dimnames = structure(list(
    c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", 
    "Friday", "Saturday"), c("BAYVIEW", "CENTRAL", "INGLESIDE", 
    "MISSION", "NORTHERN", "PARK", "RICHMOND", "SOUTHERN", "TARAVAL", 
    "TENDERLOIN")), .Names = c("", "")), class = "table")

I ran a chisq.test and the results came back significant. I now want to run some pairwise tests to see where the significance lies. I tried to use the fifer package and the chisq.post.test function but i get an error that says out of workspace.

What other ways can I run the multiple comparisons test?

回答1:

This will work (try chisq.test instead of the default fisher.test (exact) in post hoc test):

(Xsq <- chisq.test(theft_loc))  # Prints test summary, p-value very small,
#       Pearson's Chi-squared test
# data:  theft_loc
# X-squared = 1580.1, df = 54, p-value < 2.2e-16 # reject null hypothesis for independence

library(fifer)
chisq.post.hoc(theft_loc, test='chisq.test')

with output

 Adjusted p-values used the fdr method.

               comparison  raw.p  adj.p
1       Sunday vs. Monday 0.0000 0.0000
2      Sunday vs. Tuesday 0.0000 0.0000
3    Sunday vs. Wednesday 0.0000 0.0000
4     Sunday vs. Thursday 0.0000 0.0000
5       Sunday vs. Friday 0.0000 0.0000
6     Sunday vs. Saturday 0.0000 0.0000
7      Monday vs. Tuesday 0.0000 0.0000
8    Monday vs. Wednesday 0.0000 0.0000
9     Monday vs. Thursday 0.0000 0.0000
10      Monday vs. Friday 0.0000 0.0000
11    Monday vs. Saturday 0.0000 0.0000
12  Tuesday vs. Wednesday 0.1451 0.1451
13   Tuesday vs. Thursday 0.0000 0.0000
14     Tuesday vs. Friday 0.0000 0.0000
15   Tuesday vs. Saturday 0.0000 0.0000
16 Wednesday vs. Thursday 0.0016 0.0017
17   Wednesday vs. Friday 0.0000 0.0000
18 Wednesday vs. Saturday 0.0000 0.0000
19    Thursday vs. Friday 0.0000 0.0000
20  Thursday vs. Saturday 0.0000 0.0000
21    Friday vs. Saturday 0.0000 0.0000

As we can see, all the pairwise tests except a couple are significant, we can use different p-value-correction too (by changing the control from default fdr to bonferroni).

回答2:

Since fifer is no longer maintained here is a solution with RVAideMemoire (described in more detail here https://rdrr.io/cran/RVAideMemoire/src/R/chisq.multcomp.R):

install.packages("RVAideMemoire")
library(RVAideMemoire)
chisq.multcomp(theft_loc, p.method = "none")
>      7874    7909    7934    7971    7997    8106    8150    8191    8246    8277    8363    8437    8746    9090    10452   10848   11038   11265   11301  
7909  0.78056 -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -      
7934  0.63321 0.84256 -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -      
7971  0.44095 0.62272 0.76923 -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -      
7997  0.32889 0.48533 0.61768 0.83698 -       -       -       -       -       -       -       -       -       -       -       -       -       -       -      
8106  0.06647 0.11954 0.17444 0.28701 0.39036 -       -       -       -       -       -       -       -       -       -       -       -       -       -      
8150  0.02923 0.05720 0.08854 0.15860 0.22857 0.73002 -       -       -       -       -       -       -       -       -       -       -       -       -      
8191  0.01238 0.02625 0.04298 0.08354 0.12732 0.50552 0.74841 -       -       -       -       -       -       -       -       -       -       -       -      
8246  0.00339 0.00802 0.01417 0.03081 0.05073 0.27360 0.45342 0.66793 -       -       -       -       -       -       -       -       -       -       -      
8277  0.00152 0.00382 0.00706 0.01637 0.02817 0.18156 0.32174 0.50276 0.80943 -       -       -       -       -       -       -       -       -       -      
8363  0.00012 0.00037 0.00078 0.00216 0.00422 0.04522 0.09741 0.18128 0.36396 0.50497 -       -       -       -       -       -       -       -       -      
8437  1.0e-05 3.6e-05 8.5e-05 0.00027 0.00060 0.01007 0.02585 0.05643 0.13921 0.21586 0.56805 -       -       -       -       -       -       -       -      
8746  1.3e-11 8.8e-11 3.2e-10 2.0e-09 7.1e-09 8.2e-07 4.5e-06 2.0e-05 0.00013 0.00032 0.00341 0.01841 -       -       -       -       -       -       -      
9090  < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 6.2e-14 8.1e-13 8.0e-12 1.5e-10 6.9e-10 3.7e-08 8.1e-07 0.01000 -       -       -       -       -       -      
10452 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 < 2e-16 -       -       -       -       -

Category are replaced by the counts of each category. I don't like corrections for multiple comparisons (see ref below for a discussion), but fdr is available.

Moran, M. D. (2003). Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos.

来源：https://stackoverflow.com/questions/42096271/post-hoc-tests-for-chi-sq-in-r

标签

statistics