Question
I'm trying to use lm() and matchit() on a subset of covariates. I have generated an arbitrary number of columns with prefix "covar", i.e. "covar.1", "covar.2", etc. I'd like to do something like
lm(group ~ covars, data=df)
where covars is a vector of strings c("covar.1", "covar.2", ...).
I tried several things like
cols <- colnames(df)
covars <- cols[grep("covar", colnames(df))]
m.out <- matchit(group ~ covars, data=df, method="nearest", distance="logit", caliper=.20)
but got the error: variable lengths differ (found for 'covars').
Defining a new data frame with only covars and group would work, but that defeats my purpose in using matchit, because I want the matched data to keep the other columns too, not just the covariates I picked to match on.
This seems like an easy task, but I can't figure it out after some googling. I'm not sure what an R formula expects when I want a subset of columns. Any help is appreciated.
Answer 1:
You might want to use as.formula.
Try doing this:
Replace group ~ covars
with as.formula(paste('group', '~', paste(covars, collapse="+")))
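The original call fails because a formula treats covars as the name of a single variable; R finds the character vector of column names, whose length does not match the number of rows. Here is a minimal sketch of the replacement on made-up data. The data frame, its values, and the extra column `other` are assumptions for illustration only; `reformulate()` is a base R function that builds the same formula.

```r
# Synthetic data: column names mirror the question, values are invented.
set.seed(1)
df <- data.frame(
  group   = rbinom(100, 1, 0.5),
  covar.1 = rnorm(100),
  covar.2 = rnorm(100),
  other   = runif(100)  # extra column that should stay out of the model
)

# Pick the covariate columns by prefix.
covars <- grep("^covar", colnames(df), value = TRUE)

# Build group ~ covar.1 + covar.2 from the character vector.
f1 <- as.formula(paste("group", "~", paste(covars, collapse = "+")))

# Base R alternative that produces the same formula.
f2 <- reformulate(covars, response = "group")

# The formula can be passed anywhere a literal formula is accepted.
fit <- glm(f1, data = df, family = binomial)
print(f1)
print(names(coef(fit)))
```

Because only the formula changes, `data = df` still refers to the full data frame, so the other columns remain available to whatever function receives it.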
Answer 2:
I mentioned this in your other question, but the cobalt package has a function specifically for this, f.build(). The first argument to f.build() is a string containing the name of the treatment variable (or left hand side of the formula), and the second argument is a string vector containing the names of the variables to be on the right hand side of the formula (i.e., the covariates). The second argument can also be a data.frame containing the covariates; f.build() simply extracts the names. It then performs the operation described in the chosen answer, but adds a few other features that make it a little more general and robust to errors.
The cobalt documentation has a section on f.build() that uses glm() and matchit() as examples.
After running matchit(), you can assess balance on the covariates using the bal.tab() function in cobalt, which is compatible with MatchIt:
bal.tab(m.out, un = TRUE)
The documentation for cobalt explains its use with MatchIt in detail.
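Put together, the workflow described in this answer might look like the following sketch. It assumes the cobalt and MatchIt packages are installed; the data frame is synthetic, and the column `other` is a hypothetical extra variable added to show that it survives matching. The matching options from the question (distance, caliper) are left at their defaults here.

```r
library(cobalt)
library(MatchIt)

# Synthetic data: names follow the question, values are invented.
set.seed(1)
df <- data.frame(
  group   = rbinom(200, 1, 0.3),
  covar.1 = rnorm(200),
  covar.2 = rnorm(200),
  other   = runif(200)  # not used for matching
)

# Covariate names selected by prefix.
covars <- grep("^covar", colnames(df), value = TRUE)

# Build the formula from the treatment name and the covariate names.
f <- f.build("group", covars)

# Nearest-neighbor matching on the selected covariates only.
m.out <- matchit(f, data = df, method = "nearest")

# The matched data set keeps every column of df, including `other`.
matched <- match.data(m.out)

# Balance before and after matching.
bal.tab(m.out, un = TRUE)
```

Since matchit() receives the full data frame, match.data() returns all of its columns along with the matching weights, which addresses the concern in the question about losing the non-covariate columns.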
Source: https://stackoverflow.com/questions/53854697/fit-model-on-a-subset-of-columns-in-dataframe-in-r