Error when estimating random effects model with plm package when haven is loaded

Posted by 自作多情 on 2020-05-16 07:38:45

Question


I have a weird problem when estimating a random effects model with the plm package in R.

Here is a link to a dput of part of my data: https://pastebin.com/raw/mTdh26dg

My code is:

library(plm)
library(haven)
pmales <- pdata.frame(males_part, index = c("NR", "YEAR"))
random <- plm(WAGE ~ SCHOOL + EXPER + EXPER2 + BLACK + HISP + MAR + UNION + RUR + NE + NC + S + factor(YEAR), 
              data = pmales, model = "random")

The reason I included library(haven) is that my original data set is a .dta file.

When I run this code I get this error:

Error in is.pbalanced.default(x) : 
  argument "y" is missing, with no default

The weird thing is that if I start with a clean R session and don't load haven (and then import the data from the dput), I don't get this error. I do get the error if I import from the dput but load haven anyway. I also don't get the error when estimating within or pooling models (even with haven loaded).

Here is my sessionInfo():

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.3

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=nl_NL.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=nl_NL.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] haven_2.2.0 plm_2.2-3  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6     rstudioapi_0.11  Formula_1.2-3    magrittr_1.5     hms_0.5.3        MASS_7.3-51.5    lattice_0.20-41  rlang_0.4.5     
 [9] bibtex_0.4.2.2   fansi_0.4.1      stringr_1.4.0    tools_3.6.3      grid_3.6.3       nlme_3.1-144     cli_2.0.2        ellipsis_0.3.0  
[17] maxLik_1.3-8     miscTools_0.6-26 assertthat_0.2.1 lmtest_0.9-37    digest_0.6.25    lifecycle_0.2.0  tibble_3.0.0     crayon_1.3.4    
[25] bdsmatrix_1.3-4  vctrs_0.2.4      Rdpack_0.11-1    gbRd_0.4-11      glue_1.4.0       sandwich_2.5-1   stringi_1.4.6    pillar_1.4.3    
[33] compiler_3.6.3   forcats_0.5.0    pkgconfig_2.0.3  zoo_1.8-7       

Is this a bug in plm or haven? Or some sort of incompatibility of the two (or their dependencies)?


Answer 1:


I think the issue is that your data males_part is a tibble, but you don't have the tibble package loaded until you attach haven. If you don't have tibble loaded, then you won't have any methods for the tibble classes "tbl_df" and "tbl", and it will act exactly like a data frame. Once tibble is loaded, it will start to act like a tibble.

This is an issue because tibbles and data frames aren't identical, even though the class of a tibble includes "data.frame". I'd guess what's happening is that plm assumes that extracting a single column from a data frame (for example with data[, "col"]) gives a vector, but with a tibble, it gives another tibble.
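To make the difference concrete, here is a small sketch of how single-column extraction behaves for a plain data frame versus a tibble. The objects df and tbl are made up for illustration and are not the asker's data; whether this is exactly the code path that trips plm is a guess, as noted above.

```r
library(tibble)

# Toy data, purely illustrative
df  <- data.frame(x = 1:3, y = c(1.5, 2.5, 3.5))
tbl <- as_tibble(df)

# A tibble still inherits from "data.frame" ...
inherits(tbl, "data.frame")

# ... but `[` with a single column behaves differently:
is.vector(df[, "x"])        # plain data frame: the column is dropped to a vector
is.data.frame(tbl[, "x"])   # tibble: the result is still a (one-column) tibble

# Ways to get a vector out of a tibble
is.vector(tbl[["x"]])
is.vector(tbl[, "x", drop = TRUE])
```

All five expressions should evaluate to TRUE when the tibble package is loaded.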

The workaround for you is pretty simple. Just use males_part <- as.data.frame(males_part) to remove the tibble class, and then haven won't matter.
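A minimal sketch of that workaround, using the Grunfeld data that ships with plm instead of males_part (which is only available through the pastebin link). The conversion to a tibble here just mimics data imported via haven; the variable names grun_tbl, grun_df, and pgrun are placeholders.

```r
library(plm)

data("Grunfeld", package = "plm")

# Pretend the data arrived as a tibble, as it would after haven::read_dta()
grun_tbl <- tibble::as_tibble(Grunfeld)

# Strip the tibble classes before handing the data to plm
grun_df <- as.data.frame(grun_tbl)
class(grun_df)

# Build the panel data frame and fit the random effects model as usual
pgrun <- pdata.frame(grun_df, index = c("firm", "year"))
re_fit <- plm(inv ~ value + capital, data = pgrun, model = "random")
summary(re_fit)
```

With the asker's data, the same pattern would be applied to males_part before the pdata.frame() call.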

Conceivably this is worth reporting to the maintainer of plm. It's a design flaw in tibble that is causing the problem (if tibbles inherit from data.frame, they should act like data frames), but tibbles are pretty common nowadays, and that design is unlikely to change. The plm function could protect itself against this by putting data <- as.data.frame(data) early in the pdata.frame function, or protecting every column extraction with drop = TRUE.
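As an illustration of that kind of defensive coding, here is a sketch of a function that accepts either a data frame or a tibble. column_mean is a hypothetical helper written for this example; it is not part of plm and does not reflect how plm is implemented.

```r
# Hypothetical helper: works the same for data frames and tibbles
column_mean <- function(data, col) {
  # Coerce up front so tibble-specific `[` behaviour cannot leak in
  data <- as.data.frame(data)
  # Ask explicitly for a vector rather than relying on the default
  values <- data[, col, drop = TRUE]
  mean(values)
}

plain <- data.frame(x = 1:3)
column_mean(plain, "x")
column_mean(tibble::as_tibble(plain), "x")
```

Either safeguard on its own would be enough here; the sketch uses both only to show what each looks like.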



Source: https://stackoverflow.com/questions/61249692/error-when-estimating-random-effects-model-with-plm-package-when-haven-is-loaded
