What are the standard panel data model selections and steps?

不问归期 submitted on 2021-01-05 07:21:26

Question


I have a panel data set in R:

library(AER)
data(Fatalities)
# define the fatality rate
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000
# mandatory jail or community service?
Fatalities$punish <- with(Fatalities, factor(jail == "yes" | service == "yes", labels = c("no", "yes")))

I am studying the effect of beertax on fatal_rate from 1982 to 1988 across 48 states. Given the nature of the data (the same states observed in different years), I am considering a fixed effects model. To check that it is the right fit, I first did some exploratory data analysis:

coplot(fatal_rate ~  year|state, type="b", data=Fatalities)

scatterplot(fatal_rate ~  year|state, data=Fatalities)

I also looked at the heterogeneity across states and across years separately. The plots show heterogeneity. Can I conclude from this that I should use fixed effects for both entity and time?

library(gplots)  # plotmeans() comes from the gplots package
plotmeans(fatal_rate ~ state, data=Fatalities)

plotmeans(fatal_rate ~  year, data=Fatalities)
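The heterogeneity shown in the plots can also be summarized numerically. The following is a minimal sketch using only base R's aggregate(); the names state_means and year_means are illustrative and not part of the original analysis.

```r
library(AER)  # only used here for the Fatalities data set

data(Fatalities)
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

# Average fatality rate per state and per year
state_means <- aggregate(fatal_rate ~ state, data = Fatalities, FUN = mean)
year_means  <- aggregate(fatal_rate ~ year,  data = Fatalities, FUN = mean)

# Spread of the group means: a wider range points to more heterogeneity
range(state_means$fatal_rate)
range(year_means$fatal_rate)
```

These are raw group means, so they describe heterogeneity before any regressors are controlled for.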

Judging from the plots, I think I should include both entity and time fixed effects in the model. However, to support this statistically, I ran the following tests:

  1. I checked whether there is a panel effect in the data (that is, whether panel regression or ordinary pooled OLS is suitable for my data).

  2. The p-value was small, so there is a panel effect in the data. I then used plmtest to check whether I should add a time effect as well.

To my surprise, the result showed that I don’t need to add a time effect. Based on the plots, shouldn’t there be a time effect?
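For reference, the two checks described in steps 1 and 2 might be set up roughly as follows. This is only a sketch under stated assumptions: the formula is a reduced specification chosen to keep the example short, the object names (pooled, fe_ind, fe_two) are illustrative, and an explicit pooling model is used as the baseline for plmtest. The F test on year effects is added as an alternative check and was not part of the steps above.

```r
library(AER)  # Fatalities data
library(plm)  # plm(), pFtest(), plmtest()

data(Fatalities)
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

# Reduced specification (an assumption made for brevity)
form <- fatal_rate ~ beertax + unemp + log(income)
idx  <- c("state", "year")

pooled <- plm(form, data = Fatalities, index = idx, model = "pooling")
fe_ind <- plm(form, data = Fatalities, index = idx, model = "within",
              effect = "individual")
fe_two <- plm(form, data = Fatalities, index = idx, model = "within",
              effect = "twoways")

# Step 1: F test of state fixed effects against pooled OLS
f_state <- pFtest(fe_ind, pooled)

# Step 2: Breusch-Pagan LM test for time effects on the pooling model
lm_time <- plmtest(pooled, effect = "time", type = "bp")

# Alternative check: F test of year effects once state effects are included
f_time <- pFtest(fe_two, fe_ind)

f_state
lm_time
f_time
```

Unlike the plots, all three tests are conditional on the regressors in the formula, so they need not agree with what the raw yearly means suggest.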

  3. The next step should be a Hausman test to compare the fixed effects model with the random effects model (here I have added the time effect):
library(plm)
fixed <- plm(fatal_rate ~ beertax + drinkage + punish + miles + unemp + log(income),
             index = c("state", "year"), model = "within", effect = "twoways", data = Fatalities)
random <- plm(fatal_rate ~ beertax + drinkage + punish + miles + unemp + log(income),
              index = c("state", "year"), model = "random", data = Fatalities)
# Hausman test
phtest(fixed, random)

The p-value is less than 0.05, so I conclude that the fixed effects model is the better choice.
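One detail that may matter when reading this result: plm() defaults to effect = "individual", so a random effects model fitted without an effect argument does not include a time effect, while a within model with effect = "twoways" does. A minimal sketch of a like-for-like Hausman comparison is shown below. It assumes state effects only in both models and a reduced specification chosen for brevity; the names fe, re and h are illustrative.

```r
library(AER)  # Fatalities data
library(plm)  # plm(), phtest()

data(Fatalities)
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

# Reduced specification (an assumption made for brevity)
form <- fatal_rate ~ beertax + unemp + log(income)
idx  <- c("state", "year")

# Both models use the same effect, so they differ only in the estimator
fe <- plm(form, data = Fatalities, index = idx, model = "within",
          effect = "individual")
re <- plm(form, data = Fatalities, index = idx, model = "random",
          effect = "individual")

# Hausman test: within vs random effects estimates
h <- phtest(fe, re)
h
```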

I would like to know whether I followed the right steps for choosing a model for my data, and whether I am reading the plots correctly. Can I conclude from the plots that I should include both entity and time fixed effects? Why does “plmtest(fixed, c("time"), type=("bp"))” give a different result? Do I need to test the “between” estimator against the “within” (fixed effects) estimator, and if so, which test should I carry out?

Source: https://stackoverflow.com/questions/65446646/what-are-the-standard-panel-data-model-selections-and-steps
