I'm trying to replace characters in a data.frame. I have a solution for this:
> df <- data.frame(var1 = c("aabbcdefg", "aabbcdefg"))
> df
var1
1 aabbcdefg
2 aabbcdefg
> df$var1 <- gsub("a", "h", df$var1)
> df$var1 <- gsub("b", "i", df$var1)
> df$var1 <- gsub("c", "j", df$var1)
> df$var1 <- gsub("d", "k", df$var1)
> df$var1 <- gsub("e", "l", df$var1)
> df$var1 <- gsub("f", "m", df$var1)
> df
var1
1 hhiijklmg
2 hhiijklmg
>
but I would like to avoid using several gsub calls; it would be much nicer to write a function that does this all at once.
Any ideas how to proceed?
You can create from and to vectors:
from <- c('a','b','c','d','e','f')
to <- c('h','i','j','k','l','m')
and then vectorize the gsub function:
gsub2 <- function(pattern, replacement, x, ...) {
  # Apply each pattern/replacement pair in order
  for (i in seq_along(pattern))
    x <- gsub(pattern[i], replacement[i], x, ...)
  x
}
Which gives:
> df <- data.frame(var1 = c("aabbcdefg", "aabbcdefg"))
> df$var1 <- gsub2(from, to, df$var1)
> df
var1
1 hhiijklmg
2 hhiijklmg
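One thing to keep in mind with this approach is that each element of from is treated as a regular expression, and extra arguments are passed through ... to gsub. A minimal sketch, assuming you want literal matching (the input string "a.b" is just an illustration):

```r
# Same loop-based helper as above, repeated so the example is self-contained
gsub2 <- function(pattern, replacement, x, ...) {
  for (i in seq_along(pattern))
    x <- gsub(pattern[i], replacement[i], x, ...)
  x
}

# As a regex, "." matches every character, so the whole string is replaced
as_regex <- gsub2(c(".", "a"), c("-", "h"), "a.b")

# fixed = TRUE is forwarded to gsub and makes "." a literal dot
as_literal <- gsub2(c(".", "a"), c("-", "h"), "a.b", fixed = TRUE)

as_regex    # "---"
as_literal  # "h-b"
```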
You want chartr:
df$var1 <- chartr("abcdef", "hijklm", df$var1)
df
# var1
# 1 hhiijklmg
# 2 hhiijklmg
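Note that chartr translates all characters in a single pass, whereas chained gsub calls feed the result of one replacement into the next. The difference shows up when a replacement character is itself a later pattern, for example when swapping two letters. A small sketch with made-up input:

```r
x <- c("aabb", "abab")

# Sequential: a -> b first, then every b (including the new ones) -> a
sequential <- gsub("b", "a", gsub("a", "b", x))

# Simultaneous: each character is translated exactly once
simultaneous <- chartr("ab", "ba", x)

sequential    # "aaaa" "aaaa"
simultaneous  # "bbaa" "baba"
```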
If you don't want to use chartr because the substitutions may be more than one character long, another option is gsubfn from the gsubfn package (this is not gsub itself, but an extension of it). Here is one example:
> library(gsubfn)
> tmp <- list(a='apple',b='banana',c='cherry')
> gsubfn('.', tmp, 'a.b.c.d')
[1] "apple.banana.cherry.d"
The replacement can also be a function that would take the match and return the replacement value for that match.
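As a rough illustration of the function form, assuming the gsubfn package is installed, the sketch below upper-cases every match of a character class. The pattern and input string are chosen only for the example:

```r
library(gsubfn)

# The function receives each match and returns its replacement;
# characters that do not match "[a-c]" are left untouched
result <- gsubfn("[a-c]", function(m) toupper(m), "a.b.c.d")

result  # "A.B.C.d"
```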
Source: https://stackoverflow.com/questions/6954017/replace-characters-using-gsub-how-to-create-a-function