(I'm using R.) For a list of words that's called "goodwords.corpus", I am looping through the documents in a corpus, and replacing each of the words on the list "goodwo
Use \b to indicate a word boundary (written as \\b inside an R string, because the backslash itself has to be escaped):
> text <- "good night goodnight"
> gsub("\\bgood\\b", paste("good", 1234), text)
[1] "good 1234 night goodnight"
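To see why the boundary matters, here is a small side-by-side sketch using the same sample string. The variable names unanchored and anchored are only illustrative; they are not from the question.

```r
text <- "good night goodnight"

# Without boundaries, the "good" inside "goodnight" is replaced as well.
unanchored <- gsub("good", "good 1234", text)

# With \\b on both sides, only the standalone word "good" matches.
anchored <- gsub("\\bgood\\b", "good 1234", text)

print(unanchored)  # "good 1234 night good 1234night"
print(anchored)    # "good 1234 night goodnight"
```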
In your loop, something like this:
for (word in goodwords.corpus) {
  patt <- paste0('\\b', word, '\\b')
  repl <- paste(word, "1234")
  test <- gsub(patt, repl, test)
}
You are so close to getting this. You're already using paste to form the replacement string, so why not use it to form the pattern string as well?
goodwords.corpus <- c("good")
test <- "I am having a good time goodnight"
for (i in 1:length(goodwords.corpus)) {
  test <- gsub(paste0('\\<', goodwords.corpus[[i]], '\\>'), paste(goodwords.corpus[[i]], "1234"), test)
}
test
# [1] "I am having a good 1234 time goodnight"
(paste0 is merely paste(..., sep='').)
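As a quick check of that equivalence, and of why plain paste still suits the replacement string, something like the following should do. The names patt_a, patt_b and repl are placeholders chosen for this sketch.

```r
word <- "good"

# paste0 joins its arguments with no separator ...
patt_a <- paste0("\\<", word, "\\>")
# ... which is the same as paste with sep = "".
patt_b <- paste("\\<", word, "\\>", sep = "")

# Plain paste uses a single space by default, which is what the replacement needs.
repl <- paste(word, "1234")

print(identical(patt_a, patt_b))  # TRUE
print(repl)                       # "good 1234"
```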
(I posted this at the same time as @MatthewLundberg, and his answer is also correct. I'm actually more familiar with using \b rather than \< and \>, but I thought I'd continue with your code.)
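For what it's worth, the two notations can be compared directly on the example sentence. This is only a sketch with R's default regex engine (no perl = TRUE), and with_b and with_angle are made-up names.

```r
test <- "I am having a good time goodnight"

# \\b matches at either edge of a word.
with_b <- gsub("\\bgood\\b", "good 1234", test)

# \\< matches at the start of a word, \\> at the end.
with_angle <- gsub("\\<good\\>", "good 1234", test)

print(with_b)                         # "I am having a good 1234 time goodnight"
print(identical(with_b, with_angle))  # TRUE
```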