How to change na.action for zero-inflated regression model?

Submitted by 自古美人都是妖i on 2021-02-08 07:37:50

Question


I am running a zero-inflated negative binomial regression model using the function zeroinfl from the pscl package.

I need to exclude NA's from the model in order to be able to plot the residuals against the dependent variable later in the analysis.

Therefore, I want to set na.action="na.exclude". I can do this without any problem for a non-zero-inflated negative binomial regression model (using glm.nb from the MASS package), e.g.

library(MASS)  # glm.nb() lives in MASS

fm_nbin <- glm.nb(DV ~ factor(IDV) + contr1 + contr2 + contr3,
                  data = df,
                  subset = (df$var < 500), na.action = "na.exclude")
fm_nbin.res <- resid(fm_nbin)
plot(fm_nbin.res ~ df$var)

works fine. However, when I do the same for a zero-inflated model, it does not work:

library(pscl)  # zeroinfl() lives in pscl

zinfl <- zeroinfl(DV ~ factor(IDV) + contr1 + contr2 + contr3 |
                    factor(IDV) + contr1 + contr2 + contr3,
                  data = df,
                  subset = (df$var < 500), na.action = "na.exclude")
zinfl.res <- resid(zinfl)
plot(zinfl.res ~ df$var)  # this call raises the error shown below

gives the error

Error in function (formula, data = NULL, subset = NULL, na.action = na.fail,  : 
  variable lengths differ (found for 'df$var')

Is there any other command I should use to exclude NA's from my regression?

Edit: This is the closest thing to an answer I could find. Can it somehow be applied to my problem? Also, could naresid be used here?
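On the naresid question, here is a minimal sketch of how padding can work when a fit has silently dropped incomplete rows. It uses glm from base R as a stand-in, and the toy data frame d is hypothetical; whether the same steps carry over unchanged to a zeroinfl object is an assumption, not something confirmed above.

```r
# Hypothetical toy data with one missing predictor value (not the asker's df)
d <- data.frame(y = c(1, 3, 2, 5, 4, 6),
                x = c(1, 2, NA, 4, 5, 6))

# The factory-fresh na.action is na.omit, so row 3 is dropped from the fit
fit <- glm(y ~ x, data = d, family = poisson)
r_short <- residuals(fit)

# Rebuild the record of dropped rows, this time with class "exclude"
mf   <- model.frame(y ~ x, data = d, na.action = na.exclude)
omit <- attr(mf, "na.action")

# naresid() re-inserts NA at the dropped positions
r_full <- naresid(omit, r_short)

length(r_short)        # one element per row actually used in the fit
length(r_full)         # one element per row of d
which(is.na(r_full))   # position of the dropped row
```

After this, r_full has the same length as the columns of d, so it can be plotted against any of them. If a subset argument is also used, the positions recorded in omit refer to the rows remaining after subsetting, so the padded vector should be compared with the subsetted data, not the full data frame.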


Answer 1:


Following the trail of documentation from zeroinfl to glm.fit, one finds: "The ‘factory-fresh’ default is na.omit." Note that I have not put quotes around na.omit, since it is meant to be a function; the fitting function will also accept the name as a character string, so quoting it does no harm. I admit I don't really know how na.omit and na.exclude differ (something to do with residuals, from what I read), but I would start with the default setting, since it generally delivers what I want from regression functions. So try simply leaving the argument out:

zinfl <- zeroinfl(DV ~ factor(IDV) + contr1
           +contr2 + contr3 | factor(IDV) + contr1
           +contr2 + contr3, data=df, 
           subset=(df$var<500) )
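For readers unsure how the two settings differ, here is a small sketch using lm from base R on a hypothetical toy data frame d. Both settings fit the model on the same complete rows; the difference only shows up in the length of residuals (and fitted values).

```r
# Hypothetical toy data with one missing predictor value
d <- data.frame(y = c(2.1, 3.9, 6.2, 8.1, 9.8),
                x = c(1, 2, NA, 4, 5))

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

# Same coefficients: both fits use the same four complete rows
all.equal(coef(fit_omit), coef(fit_exclude))

# na.omit drops the incomplete row from the residuals ...
length(residuals(fit_omit))

# ... while na.exclude pads the residuals with NA at that row
length(residuals(fit_exclude))
residuals(fit_exclude)
```

Only the padded residuals line up with the columns of d, which is why a call such as plot(residuals(fit_exclude) ~ d$x) works for lm and glm.nb. The "variable lengths differ" error in the question suggests that the residuals returned for the zeroinfl fit were not padded in this way.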



Answer 2:


Since neither na.omit(df) nor na.action="na.exclude" seems to work in a zeroinfl regression model, I found another (indirect) way of making sure that NAs are excluded from the regression.

First, since my original dataset contains far more variables than just the regressors and the outcome variable, I created a new dataset containing only the variables used in the regression model. The same step also applies the condition on var that determines which observations enter the regression:

df1 <- subset(df, var<500, select=c("DV", "IDV", "contr1", "contr2", "contr3"))
df1 <- na.omit(df1)

I then run the same code as above using the new dataset df1, which works perfectly:

zinfl <- zeroinfl(DV ~ factor(IDV) + contr1
           +contr2 + contr3 | factor(IDV) + contr1
           +contr2 + contr3, data=df1)
zinfl.res = resid(zinfl) 
plot(zinfl.res~df1$DV)
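The question originally asked for a plot against var, which the select above drops. A possible variation is to keep var in the cleaned dataset, sketched below. The simulated df is hypothetical, only contr1 is included to keep it short, and a Poisson glm from base R stands in for zeroinfl so the example runs without pscl.

```r
set.seed(1)
n <- 200

# Hypothetical stand-in for the asker's data frame
df <- data.frame(
  DV     = rpois(n, lambda = 2),
  IDV    = sample(1:3, n, replace = TRUE),
  contr1 = rnorm(n),
  var    = runif(n, min = 0, max = 1000)
)
df$contr1[sample(n, 10)] <- NA   # inject some missing values

# Keep var alongside the model variables, then drop incomplete rows
df1 <- subset(df, var < 500, select = c("DV", "IDV", "contr1", "var"))
df1 <- na.omit(df1)

# Stand-in for zeroinfl(); the model sees only complete rows
fit <- glm(DV ~ factor(IDV) + contr1, data = df1, family = poisson)
res <- residuals(fit)

# Residuals and df1 now have matching lengths
length(res) == nrow(df1)
```

With the lengths matching, plot(res ~ df1$var) can be called directly. Keeping var in the selection does not remove extra rows here, because subset() already discards rows where the condition var < 500 evaluates to NA.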


Source: https://stackoverflow.com/questions/16376544/how-to-change-na-action-for-zero-inflated-regression-model
