Is there a better alternative than string manipulation to programmatically build formulas?

我是研究僧i 提交于 2019-11-26 07:57:13

问题


Everyone else\'s functions seem to take formula objects and then do dark magic to them somewhere deep inside and I\'m jealous.

I\'m writing a function that fits multiple models. Parts of the formulas for these models remain the same and part change from one model to the next. The clumsy way would be to have the user input the formula parts as character strings, do some character manipulation on them, and then use as.formula.

But before I go that route, I just want to make sure that I\'m not overlooking some cleaner way of doing it that would allow the function to accept formulas in the standard R format (e.g. extracted from other formula-using objects).

I want something like...

> LHS <- y~1; RHS <- ~a+b; c(LHS,RHS);
y ~ a + b
> RHS2 <- ~c;
> c(LHS, RHS, RHS2);
y ~ a + b + c

or...

> LHS + RHS;
y ~ a + b
> LHS + RHS + RHS2;
y ~ a + b + c

...but unfortunately neither syntax works. Does anybody know if there is something that does? Thanks.


回答1:


reformulate will do what you want.

reformulate(termlabels = c('x','z'), response = 'y')
## y ~ x + z

Or without an intercept

reformulate(termlabels = c('x','z'), response = 'y', intercept = FALSE)
## y ~ x + z - 1

Note that you cannot construct formulae with multiple reponses such as x+y ~z+b

reformulate(termlabels = c('x','y'), response = c('z','b'))
z ~ x + y

To extract the terms from an existing formula (given your example)

attr(terms(RHS), 'term.labels')
## [1] "a" "b"

To get the response is slightly different, a simple approach (for a single variable response).

as.character(LHS)[2]
## [1] 'y'


combine_formula <- function(LHS, RHS){
  .terms <- lapply(RHS, terms)
  new_terms <- unique(unlist(lapply(.terms, attr, which = 'term.labels')))
  response <- as.character(LHS)[2]

  reformulate(new_terms, response)


}


combine_formula(LHS, list(RHS, RHS2))

## y ~ a + b + c
## <environment: 0x577fb908>

I think it would be more sensible to specify the response as a character vector, something like

combine_formula2 <- function(response, RHS, intercept = TRUE){
  .terms <- lapply(RHS, terms)
  new_terms <- unique(unlist(lapply(.terms, attr, which = 'term.labels')))
  response <- as.character(LHS)[2]

  reformulate(new_terms, response, intercept)


}
combine_formula2('y', list(RHS, RHS2))

you could also define a + operator to work with formulae (update setting an new method for formula objects)

`+.formula` <- function(e1,e2){
  .terms <- lapply(c(e1,e2), terms)
  reformulate(unique(unlist(lapply(.terms, attr, which = 'term.labels'))))
}

RHS + RHS2
## ~a + b + c

You can also use update.formula using . judiciously

 update(~a+b, y ~ .)
 ##  y~a+b


来源:https://stackoverflow.com/questions/12967797/is-there-a-better-alternative-than-string-manipulation-to-programmatically-build

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!