Error during wrapup: long vectors not supported yet: in glm() function

倾然丶 夕夏残阳落幕 submitted on 2021-02-11 16:42:44

Question


I found several questions on Stack Overflow about this error (some of them unanswered), but so far none that relate to it occurring in a regression.

I'm running a probit model in R with (I'm guessing) too many fixed effects (year and place):

myprobit <- glm(factor(Y) ~ factor(T) + factor(X1) + factor(X2) + factor(X3) +
                 factor(YEAR) + factor(PLACE),
                 family = binomial(link = "probit"),
                 data = DT)

The PLACE variable has about 1,000 unique values and YEAR has 8. The dataset DT has 13,099,225 observations and 79 columns.

The error I got is:

Error: cannot allocate vector of size 59.3 Gb
Error during wrapup: long vectors not supported yet: ../include/Rinlinedfuns.h:519

The machine I'm using has 128 GB of RAM.

So I don't know what I can do without changing the function. Does anyone know how to deal with this issue? Thanks!


Answer 1:


To close this question, I should mention that @Axeman's answer is the only feasible approach for my problem. The underlying issue is that there is not enough memory to hold such a huge design matrix.
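
To make the memory point concrete, here is a rough back-of-the-envelope sketch in base R. The row count comes from the question; the column count of 1,000 is only an assumption (roughly one dummy column per PLACE level), since the exact number of levels of the other factors is not given. The helper names `design_matrix_gib` and `needs_long_vector` are made up for this illustration.

```r
# Rough size of a dense double-precision matrix, in GiB (8 bytes per element).
design_matrix_gib <- function(n_rows, n_cols) {
  n_rows * n_cols * 8 / 2^30
}

# TRUE when the matrix has more elements than a 32-bit integer can index,
# i.e. when R has to store it as a "long vector".
needs_long_vector <- function(n_rows, n_cols) {
  n_rows * n_cols > .Machine$integer.max
}

n_rows <- 13099225   # observations reported in the question
n_cols <- 1000       # assumption: about one dummy column per PLACE level

design_matrix_gib(n_rows, n_cols)   # size of ONE copy of the design matrix
needs_long_vector(n_rows, n_cols)   # whether it crosses the 2^31 - 1 element limit
```

Under these assumptions a single copy of the matrix is already on the order of 100 GiB, and fitting a model generally needs more than one copy's worth of working memory, so 128 GB of RAM does not leave enough headroom.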

Therefore, running the probit regression with the bigglm() function from the biglm package is the only solution I have found so far.
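
For illustration, a minimal sketch of what such a call might look like. The simulated data frame `DT_small`, its size, the number of PLACE levels, and the `chunksize` and `maxit` values are all assumptions chosen to keep the example small; they are not the settings used for the real data. The response is kept as numeric 0/1 here rather than wrapped in factor().

```r
# Sketch only: simulated data stands in for the real DT.
library(biglm)

set.seed(1)
n <- 20000
DT_small <- data.frame(
  T     = rbinom(n, 1, 0.5),
  YEAR  = factor(sample(2001:2008, n, replace = TRUE)),
  PLACE = factor(sample(sprintf("P%02d", 1:20), n, replace = TRUE))
)
# Simulate a binary outcome from a probit model with a positive effect of T.
DT_small$Y <- rbinom(n, 1, pnorm(-0.5 + 0.8 * DT_small$T))

# bigglm() works through the data in chunks of `chunksize` rows
# instead of building the full design matrix at once.
fit <- bigglm(Y ~ T + YEAR + PLACE,
              data = DT_small,
              family = binomial(link = "probit"),
              chunksize = 5000,
              maxit = 25)
summary(fit)
```

Because YEAR and PLACE are stored as factors in the data frame itself, every chunk carries the full set of levels.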

However, because biglm processes the data iteratively in chunks, using factor() variables on the right-hand side is problematic whenever a factor level is not represented in a chunk. In other words, if a factor variable has 5 levels but only 4 of them appear in a given chunk, the estimation fails with an error.

There are several questions and comments about this on Stack Overflow.
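
The mismatch can be reproduced in base R without biglm. The toy data frame below is hypothetical: it has a 5-level variable `g`, and a chunk of rows in which only 4 of those levels occur. Calling factor() inside the formula rebuilds the levels from the chunk alone, whereas converting the column to a factor once, before chunking, keeps all 5 levels in every chunk. This is a sketch of a commonly suggested workaround, not something confirmed for the original data.

```r
# Toy data: g has 5 distinct values overall.
df <- data.frame(g = c("a", "b", "c", "d", "e", "a", "b", "c", "d"),
                 stringsAsFactors = FALSE)

# A chunk in which level "e" does not occur.
chunk <- df[6:9, , drop = FALSE]

# factor() inside the formula only sees the 4 values present in the chunk.
cols_per_chunk <- ncol(model.matrix(~ factor(g), data = chunk))

# Converting once, up front, fixes the level set for every later subset.
df$g <- factor(df$g)
chunk_fixed <- df[6:9, , drop = FALSE]
cols_fixed <- ncol(model.matrix(~ g, data = chunk_fixed))

cols_per_chunk   # intercept + 3 dummies
cols_fixed       # intercept + 4 dummies
```

The two design matrices have different numbers of columns, which is why chunk-wise estimation breaks when the levels are derived per chunk.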



Source: https://stackoverflow.com/questions/60419930/error-during-wrapup-long-vectors-not-supported-yet-in-glm-function
