splitting string expression at multiple delimiters in R

我与影子孤独终老i posted on 2019-12-02 07:59:11

There's probably a cleaner way of doing this, but does this cover your use case(s)?

eqn = "3 + 2*(x1+x2-3*x3 - x1/x3) - 5"

vars = unlist(strsplit(eqn, split="[-+*/)( ]|[^x][0-9]+|^[0-9]+"))
vars = vars[nchar(vars)>0]  # To remove empty strings

vars
[1] "x1" "x2" "x3" "x1" "x3"
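
The split pattern above is dense, so here is a minimal sketch that builds the same pattern from three named pieces. The names delims, inner_num and lead_num are only illustrative and are not part of the original answer; the resulting pattern string is identical to the one used above.

```r
eqn <- "3 + 2*(x1+x2-3*x3 - x1/x3) - 5"

# Illustrative names for the three alternatives of the split pattern
delims    <- "[-+*/)( ]"   # operators, parentheses and spaces
inner_num <- "[^x][0-9]+"  # a non-"x" character followed by digits, e.g. "-3" or " 2"
lead_num  <- "^[0-9]+"     # digits at the start of the (remaining) string

# Join the pieces with "|" to get the same regex as above
pattern <- paste(delims, inner_num, lead_num, sep = "|")

vars <- unlist(strsplit(eqn, split = pattern))
vars <- vars[nzchar(vars)]  # drop the empty strings left between adjacent delimiters
vars
```

Because the second alternative excludes only "x" before the digits, this approach assumes every variable is named x followed by a number.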

If you only want each unique value to show up once, you can do:

vars = unlist(strsplit(eqn, split="[-+*/)( ]|[^x][0-9]+|^[0-9]+"))
vars = unique(vars[nchar(vars)>0])

vars
[1] "x1" "x2" "x3"

MrFlick

Rather than using regular expressions, you could use the R parser to find particular symbols in your expression. If you reuse the find_vars() function from this answer, you could do:

extract_vars <- function(x) {
    find_vars(parse(text=x)[[1]])$found
}
expr <- "2*(x1+x2-3*x3)"
extract_vars(expr)
# [1] "x1" "x2" "x3"

Of course this method assumes that all the math expressions that your users enter would also be syntactically-valid R code.
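
Since find_vars() is not defined here, a rough parser-based sketch can be written with base R's all.vars() instead. This is a substitute technique, not the find_vars() function from the referenced answer, and extract_vars_base is a hypothetical helper name. The tryCatch() illustrates the caveat above: input that is not valid R syntax fails to parse.

```r
# Hypothetical helper: parser-based extraction using base R's all.vars()
extract_vars_base <- function(x) {
    tryCatch(
        all.vars(parse(text = x)),        # names of variables found in the parsed code
        error = function(e) character(0)  # input is not syntactically valid R
    )
}

extract_vars_base("2*(x1+x2-3*x3)")  # valid R, so the variable names are returned
extract_vars_base("2x1*(x1+x2)")     # "2x1" does not parse, so the result is empty
```

Note that all.vars() returns each name once by default, so no separate unique() step is needed.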

DatamineR

More generally, you can extract the variable names directly with this regex (a letter followed by one or more digits): "[A-Za-z]\d+"

library(stringr)
f <- "2*(x1+x2-3*x3)"
pattern <- "[A-Za-z]\\d+"
str_extract_all(f, pattern)
[[1]]
[1] "x1" "x2" "x3"

Since this is symbolic math, you may have variables other than x; the same pattern picks those up too:

library(stringr)
# A slightly different example
var <- "2x1*(x1+x2-3*x3)*y1"
pattern <- "[A-Za-z]\\d+"
str_extract_all(var, pattern)
[[1]]
[1] "x1" "x1" "x2" "x3" "y1"
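
If you would rather avoid the stringr dependency, the same extraction can be sketched in base R with gregexpr() and regmatches(). This assumes variable names are a single letter followed by digits, and it uses the POSIX-style class [0-9] in place of \d.

```r
f <- "2x1*(x1+x2-3*x3)*y1"
pattern <- "[A-Za-z][0-9]+"  # one letter followed by one or more digits

# gregexpr() locates every match; regmatches() pulls out the matched text
matches <- regmatches(f, gregexpr(pattern, f))[[1]]
matches          # every occurrence, in order of appearance

unique(matches)  # each variable name once
```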