Accessing fitted.values when using ddply

社会主义新天地 提交于 2019-12-13 18:03:53

问题


I am using ddply to execute glm on subsets of my data. I am having difficulty accessing the estimated Y values. I am able to get the model parameter estimates using the below code, but all the variations I've tried to get the fitted values have fallen short. The dependent and independent variables in the glm model are column vectors, as is the "Dmsa" variable used in the ddply operation.

Define the model:

Model <- function(df){coef(glm(Y~D+O+B+A+log(M), family=poisson(link="log"), data=df))}

Execute the model on subsets:

Modrpt <- ddply(msadata, "Dmsa", Model)

Print Modrpt gives the model coefficients, but no Y estimates.

I know that if I wasn't using ddply, I can access the glm estimated Y values by using the code:

Model <- glm(Y~D+O+B+A+log(M), family=poisson(link="log"), data=msadata)

fits <- Model$fitted.values

I have tried both of the following to get the fitted values for the subsets, but no luck:

fits <- fitted.values(ddply(msadata, "Dmsa", Model))

fits <- ddply(msadata, "Dmsa", fitted.values(Model))

I'm sure this is a very easy to code...unfortunately, I'm just learning R. Does anyone know where I am going wrong?


回答1:


You can use an anonymous function in your call to ddply e.g.

require(plyr)
data(iris)
model <- function(df){
    lm( Petal.Length ~ Sepal.Length + Sepal.Width , data = df )
    }

ddply( iris , "Species" , function(x) fitted.values( model(x) ) ) 

This has the advantage that you can also, without rewriting your model function, get thecoef values by doing

    ddply( iris , "Species" , function(x) coef( model(x) ) ) 

As @James points out, this will fall down if you have splits of unequal size, better to use dlply which puts the result of each subset in it's own list element.

(I make no claims for statistical relevance or correctness of the example model - it is just an example)




回答2:


I'd recommending doing this in two steps:

library(plyr)

# First first the models
models <- dlply(iris, "Species", lm, 
  formula = Petal.Length ~ Sepal.Length + Sepal.Width )

# Next, extract the fitted values
ldply(models, fitted.values)

# Or maybe
ldply(models, as.data.frame(fitted.values))


来源:https://stackoverflow.com/questions/18082160/accessing-fitted-values-when-using-ddply

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!