I want to replace NA with 0 in 20 columns. I found this approach for 2 columns, but I guess it's not optimal when the number of columns is 20. Is there a more compact alternative?
mydata[,c("a", "c")] <-
apply(mydata[,c("a","c")], 2, function(x){replace(x, is.na(x), 0)})
UPDATE: For simplicity, let's take this data with 8 columns and substitute NAs in columns b, c, e and f
a b c d e f g d
1 NA NA 2 3 4 7 6
2 g 3 NA 4 5 4 Y
3 r 4 4 NA t 5 5
The result must be this one:
a b c d e f g d
1 0 0 2 3 4 7 6
2 g 3 NA 4 5 4 Y
3 r 4 4 0 t 5 5
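For anyone who wants to run the snippets in this thread, the example could be built as a data frame roughly like this. It is only a sketch. The column types are guessed from the printed table: b, f and the second d contain letters, so they are assumed to be character. The second d column is named d.1 here because R makes duplicate column names unique when the data is read in.

```r
# Sketch of the example data; column types are assumed from the printed table.
mydata <- data.frame(
  a   = c(1, 2, 3),
  b   = c(NA, "g", "r"),
  c   = c(NA, 3, 4),
  d   = c(2, NA, 4),
  e   = c(3, 4, NA),
  f   = c("4", "5", "t"),
  g   = c(7, 4, 5),
  d.1 = c("6", "Y", "5"),   # second "d" column, renamed to keep names unique
  stringsAsFactors = FALSE
)

# Count the NAs in each column
colSums(is.na(mydata))
```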
We can use NAer from qdap to convert the NAs to 0. If there are multiple columns, loop over them using lapply.
library(qdap)
nm1 <- c('b', 'c', 'e', 'f')
mydata[nm1] <- lapply(mydata[nm1], NAer)
mydata
# a b c d e f g d.1
#1 1 0 0 2 3 4 7 6
#2 2 g 3 NA 4 5 4 Y
#3 3 r 4 4 0 t 5 5
Or using dplyr
library(dplyr)
mydata %>%
mutate_each_(funs(replace(., which(is.na(.)), 0)), nm1)
# a b c d e f g d.1
#1 1 0 0 2 3 4 7 6
#2 2 g 3 NA 4 5 4 Y
#3 3 r 4 4 0 t 5 5
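Note that mutate_each_ and funs are deprecated in current dplyr releases. A minimal sketch of the same idea with across(), assuming dplyr 1.0.0 or later, might look like this. The data frame is rebuilt here only so the snippet runs on its own, and its column types are an assumption based on the printed table.

```r
library(dplyr)

# Example data (column types assumed from the printed table)
mydata <- data.frame(
  a = c(1, 2, 3), b = c(NA, "g", "r"), c = c(NA, 3, 4), d = c(2, NA, 4),
  e = c(3, 4, NA), f = c("4", "5", "t"), g = c(7, 4, 5),
  d.1 = c("6", "Y", "5"), stringsAsFactors = FALSE
)

nm1 <- c("b", "c", "e", "f")

# Replace NA with 0 only in the columns named in nm1.
# In character columns the 0 is coerced to the string "0".
result <- mydata %>%
  mutate(across(all_of(nm1), ~ replace(.x, is.na(.x), 0)))

result
```

Column d is not in nm1, so its NA is left alone, as in the expected result.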
The replace_na function from tidyr can be applied to a vector as well as a data frame (http://tidyr.tidyverse.org/reference/replace_na.html). Use it with a mutate_at variation from dplyr to apply it to multiple columns at the same time:
mydata %>% mutate_at(vars(b, c, e, f), replace_na, 0)
or
mydata %>% mutate_at(c('b', 'c', 'e', 'f'), replace_na, 0)
Another option:
library(tidyr)
v <- c('b', 'c', 'e', 'f')
replace_na(mydata, as.list(setNames(rep(0, length(v)), v)))
Which gives:
# a b c d e f g d.1
#1 1 0 0 2 3 4 7 6
#2 2 g 3 NA 4 5 4 Y
#3 3 r 4 4 0 t 5 5
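If you would rather not load a package, the same thing can probably be done in base R with logical indexing on the selected columns. This is just a sketch of that idea. The data frame is rebuilt so the snippet is self-contained, and its column types are an assumption based on the printed table.

```r
# Example data (column types assumed from the printed table)
mydata <- data.frame(
  a = c(1, 2, 3), b = c(NA, "g", "r"), c = c(NA, 3, 4), d = c(2, NA, 4),
  e = c(3, 4, NA), f = c("4", "5", "t"), g = c(7, 4, 5),
  d.1 = c("6", "Y", "5"), stringsAsFactors = FALSE
)

v <- c("b", "c", "e", "f")

# is.na() on a data frame returns a logical matrix; use it to overwrite
# only the NA cells inside the selected columns.
mydata[v][is.na(mydata[v])] <- 0

mydata
```

Only the columns listed in v are touched, so this scales to 20 columns by extending the vector of names.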
Source: https://stackoverflow.com/questions/33067547/how-to-substitute-na-by-0-in-20-columns