问题
I'm looking to split a string of a generic form, where the square brackets denote the "sections" of the string. Ex:
x <- "[a] + [bc] + 1"
And return a character vector that looks like:
"[a]" " + " "[bc]" " + 1"
EDIT: Ended up using this:
x <- "[a] + [bc] + 1"
x <- gsub("\\[",",[",x)
x <- gsub("\\]","],",x)
strsplit(x,",")
回答1:
I've seen TylerRinker's code and suspect it may be more clear than this but this may serve as way to learn a different set of functions. (I liked his better before I noticed that it split on spaces.) I tried adapting this to work with strsplit
but that function always removes the separators.
Maybe this could be adapted to make a newstrsplit
that splits at the separators but leaves them in? Probably need to not split at first or last position and distinguish between opening and closing separators.
scan(text= # use scan to separate after insertion of commas
gsub("\\]", "],", # put commas in after "]"'s
gsub(".\\[", ",[", x)) , # add commas before "[" unless at first position
what="", sep=",") # tell scan this character argument and separators are ","
#Read 4 items
#[1] "[a]" " +" "[bc]" " + 1"
回答2:
This is one lazy approach:
FUN <- function(x) {
all <- unlist(strsplit(x, "\\s+"))
last <- paste(c(" ", tail(all, 2)), collapse="")
c(head(all, -2), last)
}
x <- "[a] + [bc] + 1"
FUN(x)
## > FUN(x)
## [1] "[a]" "+" "[bc]" " +1"
回答3:
You can compute the split points manually and use substring
:
split.pos <- gregexpr('\\[.*?]',x)[[1]]
split.length <- attr(split.pos, "match.length")
split.start <- sort(c(split.pos, split.pos+split.length))
split.end <- c(split.start[-1]-1, nchar(x))
substring(x,split.start,split.end)
# [1] "[a]" " + " "[bc]" " + 1"
回答4:
And here's a version that splits on the brackets AND keeps them in the result, using positive lookahead and lookbehind:
splitme <- function(x) {
x <- unlist(strsplit(x, "(?=\\[)", perl=TRUE))
x <- unlist(strsplit(x, "(?<=\\])", perl=TRUE))
for (i in which(x=="[")) {
x[i+1] <- paste(x[i], x[i+1], sep="")
}
x[-which(x=="[")]
}
splitme(x)
#[1] "[a]" " + " "[bc]" " + 1"
来源:https://stackoverflow.com/questions/15573887/split-string-with-regex