obtaining average coefficients and adj. R^2 from multiple pooled regressions using lapply

问题

I have performed multiple Pooled regressions with a loop function and stored the regression output in a list (myregression). What i would like to do now is to efficiently perform the coeftest function in the lmtest package over all my regressions (i.e. myregression list) to adjust standard errors and t-statistics. Finally i would like to obtain the mean of the coefficients, standard errors and t-values.

Here is what i came up so far:

library(plm)
data("Grunfeld", package="plm")

# Store each subset regression in myregression
myregression <- list()

count <- 1

# Regression on six-year subsets of Grunfeld
for(t in 1940:1950){

  myregression[[count]] <- plm(inv ~ value + capital, 
                              subset(Grunfeld, year<=t & year>=t-5),
                              index=c("firm","year"))

# Name each regression based on the year range included in the data subset
names(myregression)[[count]] = paste0("Year_",t)
count <- count+1
}

Here is where my problems kick in: Although i'm able to perform the coeftest function to invidiual components of the list, I'm unable to code the lapply function accordingly.

## Apply coeftest function to all plm-objects
library(lmtest)

coeftest(myregression$Year_1940, vcov=function(x) vcovSCC(x, type="HC3", maxlag=4))
coeftest(myregression$Year_1941, vcov=function(x) vcovSCC(x, type="HC3", maxlag=4))

COEFTEST<-lapply(myregression, coeftest(x, vcov=function(x) vcovSCC(x, type="HC3", maxlag=4)))

## obtaining average coefficients, se's,and t values over all regressions
lapply(COEFTEST, mean)

I hope there is only a minor mistake that I'm unable to see. I further noticed that the plm regression output is smaller than regular lm output is there another way to obtain mean adj. R^2?

回答1:

Try

 COEFTEST<-lapply(myregression, function(x) coeftest(x, vcov=vcovSCC(x, type="HC3", maxlag=4)))

Thsi will give you a list of coeftest-outputs for each regression, which you can then use in whatever way you want.

As a sidenote, make sure that whatever you do with this output makes sense. Taking the mean of an coeftest-output is not obvously sensible to me. If you want to have the average over all coefficients, try something like

 mean(sapply(COEFTEST, function(x)  x["capital", "Estimate"]))

Here, sapply retrieves all estimates for the variable capital from the COEFTEST outputs and puts them into a vector.

To access other elements, it is helpful to look at the structure of the objects using str(). For instance, str(summary(myregression[[1]])) reveals that the r-squares are saved under the name r.squared. These you can access using e.g. summary(myregression[[1]])$r.squared for the first regression output. Automating this, you can again use a construct as above, e.g.

 sapply(myregression, function(x) summary(x)$r.squared)

回答2:

There is a nicer way to obtain average coefficients (technically known as Mean Group estimators) using pmg() in plm. And from the looks of it, you are trying to estimate Fama-MacBeth regressions and SEs.

require(foreign)
require(plm)
require(lmtest)
test <- read.dta("http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.dta")
fpmg <- pmg(y~x, test, index=c("year","firmid")) ##Fama-MacBeth

> ##Fama-MacBeth
> coeftest(fpmg)

t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.031278   0.023356  1.3392   0.1806    
x           1.035586   0.033342 31.0599   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

For further details see Fama-MacBeth and Cluster-Robust (by Firm and Time) Standard Errors in R.