Question
I have recently been using the R package nparcomp to test for significant differences in my response variable between categories. I found that the nparcomp function cannot handle large datasets (more than 5,000 rows). For example, here is my code:
a <- nparcomp(oc20_kgm2 ~ decade, data = dat, asy.method = "mult.t",
              type = "Tukey", alternative = "two.sided",
              plot.simci = TRUE, info = FALSE)
summary(a)
where oc20_kgm2 is my response variable, decade is my factor (with 10 categories), and dat is my dataset. My original dataset has about 15,000 rows. Running the code above produces this error:
Error in checkmvArgs(lower = lower, upper = upper, mean = delta, corr = corr, :
‘lower’ not specified or contains NA
In addition: There were 49 warnings (use warnings() to see them)
To diagnose this, I randomly selected 5,000 samples from my original dat and ran the same code, which worked. However, samples of 5,500 or 10,000 rows do not work.
My question: is there a sample-size limit for this function? And is there another test function or package in R that I can use instead?
Revision after reading the comment:
traceback()
4: stop(sQuote("lower"), " not specified or contains NA")
3: checkmvArgs(lower = lower, upper = upper, mean = delta, corr = corr,
sigma = sigma)
2: pmvt(lower = -abs(T[pp]), abs(T[pp]), corr = rho.bf, df = df.sw,
delta = rep(0, nc))
1: nparcomp(oc20_kgm2 ~ decade, data = dat2, asy.method = "mult.t",
type = "Tukey", alternative = "two.sided", plot.simci = TRUE,
info = FALSE)
> warnings()
Warning messages:
1: In n[j] * n[w] * n[i] : NAs produced by integer overflow
2: In n[i] * n[w] * n[j] : NAs produced by integer overflow
3: In n[i] * n[v] * n[j] : NAs produced by integer overflow
4: In cov2cor(cov.bf) :
diag(.) had 0 or NA entries; non-finite result is doubtful
Answer 1:
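This error occurs because n, the vector of group sizes (one entry per factor level), is an integer vector and is therefore vulnerable to integer overflow. Products of three group sizes, such as the n[j] * n[w] * n[i] shown in the warnings, can exceed R's largest representable integer (2,147,483,647) and become NA, and those NAs then propagate into the arguments passed to pmvt. To fix it, modify the source code of nparcomp from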
n <- sapply(samples, length)
to
n <- as.numeric(sapply(samples, length))
To view the source code, type nparcomp at an R prompt.
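The overflow can be reproduced without nparcomp. The following is a minimal sketch, assuming ten equally sized groups of 1,500 observations. Those sizes are hypothetical and not taken from the original data; the variable name samples mirrors the one in the nparcomp source line quoted above.

```r
# Hypothetical data: 15,000 values split into 10 groups of 1,500 each.
samples <- split(seq_len(15000), rep(1:10, each = 1500))

# As in the original source line: sapply() returns an integer vector here.
n_int <- sapply(samples, length)

# Integer arithmetic: 1500^3 exceeds .Machine$integer.max, so the result
# is NA (R also warns "NAs produced by integer overflow").
prod_int <- suppressWarnings(n_int[[1]] * n_int[[2]] * n_int[[3]])

# After as.numeric(), the same product is computed in double precision.
n_num <- as.numeric(n_int)
prod_num <- n_num[[1]] * n_num[[2]] * n_num[[3]]

print(is.integer(n_int))
print(is.na(prod_int))
print(prod_num > .Machine$integer.max)
```

Whether a given dataset triggers the overflow depends on its actual group sizes, so running table() on the grouping factor is a quick way to see how large the products can get.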
Source: https://stackoverflow.com/questions/24047659/dataset-limitation-in-r-package-nparcomp