glmulti Oversized candidate set

时光毁灭记忆、已成空白 submitted 2019-12-01 04:33:50

I have encountered the same problem, here is what I have found out so far:

  1. The number of rows does not seem to be the issue. The issue is that with 150 predictors the package cannot handle an exhaustive search (that is, fitting and comparing all possible models). In my experience, your specific error message, "Oversized candidate set", is triggered by the fact that you also allow pairwise interactions (level=2; set level=1 to prohibit interactions). You will then most likely run into the warning message "Too many predictors". In my (very limited) experimentation, the largest candidate set I got to work was about a billion models: with n covariates there are 2^n possible main-effects-only models, so 30 covariates give 2^30 = 1,073,741,824. Here is the code I used to evaluate this:

    out <- integer(50)
    for (i in 2:40) {
      # method = "d" runs glmulti in diagnostic mode and returns the
      # number of candidate models instead of fitting them
      out[i] <- glmulti(names(data)[1], names(data)[2:i],
                        method = "d", level = 1, crit = aic, data = data)
    }

    Once the loop hits 31 covariates, the candidate set comes back with 0 models; at 33 and above it starts returning the warning message. My data had about 100 variables and around 1,000 rows, but as I said, the problem is the width of the dataset, not its depth.
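    To see why the candidate set explodes, note that with main effects only each covariate is either in or out of a model, so the count is 2^n. A quick sketch in plain R (no glmulti needed):

        # Candidate-set size with main effects only: 2^n models for n covariates.
        n <- c(10, 20, 30, 31)
        data.frame(covariates = n, models = 2^n)
        # 30 covariates -> 1,073,741,824 models; each extra covariate doubles it

    This matches the cutoff observed above: right around 30 covariates the count passes a billion and the package gives up.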

  2. As I said, start by eliminating the interactions; then consider applying other variable-reduction techniques first to get your variable count down (factor analysis, principal components, or clustering). The trade-off with those is that you lose some explainability but keep predictive power.
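    As an illustration of the principal-components route, here is a hedged sketch. The data frame name `data`, the response sitting in column 1, and the 90% variance cutoff are all assumptions for the example, not details from the original question:

        # Hypothetical sketch: shrink 100+ predictors to a few principal
        # components before running glmulti. Assumes `data` has the response
        # in column 1 and numeric predictors in the remaining columns.
        pc <- prcomp(data[, -1], center = TRUE, scale. = TRUE)
        # keep the fewest components explaining ~90% of the variance
        k  <- which(cumsum(pc$sdev^2) / sum(pc$sdev^2) >= 0.90)[1]
        reduced <- data.frame(data[1], pc$x[, 1:k, drop = FALSE])
        # an exhaustive glmulti search over k components is now feasible (2^k models)

    With k in the single digits or low tens, the exhaustive search becomes tractable again, at the cost of the components being harder to interpret than the raw variables.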

  3. The glmulti documentation compares the package with alternatives, highlighting their use cases, benefits, and drawbacks.

PS: I ran this on Win7, 64-bit, 16GB RAM, R 3.1.0, glmulti 1.0.7. PPS: The author of the package was said to be releasing a version 2.0 last year that would fix some of these issues. Read more at the source.
