Question
Does anyone know why dplyr::case_when() produces an error in the following code?
tibble(tmp1 = sample(c(T, F), size = 32, replace = T),
       tmp2 = sample(c(T, F), size = 32, replace = T),
       tmp3 = sample(c(T, F), size = 32, replace = T)) %>%
  mutate(tmp = apply(cbind(tmp1, tmp2, tmp3), 1, function(x) {
    case_when(
      all(x == F) ~ "N",
      any(x == T) ~ "Y"
    )
  }))
Error in mutate_impl(.data, dots) :
Evaluation error: object 'x' not found.
I am using R 3.4.3 with dplyr 0.7.4 on Ubuntu 16.04.
The error message is quite confusing, since the following code works fine, which indicates that x is not missing:
tibble(tmp1 = sample(c(T, F), size = 32, replace = T),
       tmp2 = sample(c(T, F), size = 32, replace = T),
       tmp3 = sample(c(T, F), size = 32, replace = T)) %>%
  mutate(tmp = apply(cbind(tmp1, tmp2, tmp3), 1, function(x) {
    if (all(x == F)) {
      "N"
    } else if (any(x == T)) {
      "Y"
    }
  }))
Just for reference, the following code also works fine:
cbind(tmp1 = sample(c(T, F), size = 32, replace = T),
      tmp2 = sample(c(T, F), size = 32, replace = T),
      tmp3 = sample(c(T, F), size = 32, replace = T)) %>%
  apply(1, function(x) {
    case_when(
      all(x == F) ~ "N",
      any(x == T) ~ "Y"
    )
  })
Answer 1:
The issue is that case_when() does not operate row-wise. However, we can simplify the code by using rowSums() (which does operate row-wise) together with case_when().
library(dplyr)
set.seed(151)
tibble(tmp1 = sample(c(T, F), size = 32, replace = T),
       tmp2 = sample(c(T, F), size = 32, replace = T),
       tmp3 = sample(c(T, F), size = 32, replace = T)) %>%
  mutate(tmp = case_when(
    rowSums(.) == 0 ~ "N",
    rowSums(.) > 0 ~ "Y"
  ))
# # A tibble: 32 x 4
# tmp1 tmp2 tmp3 tmp
# <lgl> <lgl> <lgl> <chr>
# 1 TRUE TRUE FALSE Y
# 2 FALSE FALSE TRUE Y
# 3 FALSE FALSE TRUE Y
# 4 FALSE FALSE TRUE Y
# 5 TRUE FALSE FALSE Y
# 6 FALSE FALSE FALSE N
# 7 TRUE FALSE FALSE Y
# 8 FALSE TRUE FALSE Y
# 9 TRUE TRUE FALSE Y
# 10 FALSE FALSE TRUE Y
# # ... with 22 more rows
Or, since there are only two conditions, rowSums() with ifelse() should be fine.
set.seed(151)
tibble(tmp1 = sample(c(T, F), size = 32, replace = T),
       tmp2 = sample(c(T, F), size = 32, replace = T),
       tmp3 = sample(c(T, F), size = 32, replace = T)) %>%
  mutate(tmp = ifelse(rowSums(.) == 0, "N", "Y"))
# # A tibble: 32 x 4
# tmp1 tmp2 tmp3 tmp
# <lgl> <lgl> <lgl> <chr>
# 1 TRUE TRUE FALSE Y
# 2 FALSE FALSE TRUE Y
# 3 FALSE FALSE TRUE Y
# 4 FALSE FALSE TRUE Y
# 5 TRUE FALSE FALSE Y
# 6 FALSE FALSE FALSE N
# 7 TRUE FALSE FALSE Y
# 8 FALSE TRUE FALSE Y
# 9 TRUE TRUE FALSE Y
# 10 FALSE FALSE TRUE Y
# # ... with 22 more rows
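To see why the rowSums() approach works, here is a minimal base R sketch on a small fixed matrix. The data and the names m, n_true and label are made up for illustration and are not part of the original answer. It relies on rowSums() treating TRUE as 1 and FALSE as 0, so a row sum of 0 means every value in that row is FALSE.

```r
# Hypothetical 3-row logical matrix (fixed values instead of a random sample)
m <- cbind(tmp1 = c(TRUE, FALSE, FALSE),
           tmp2 = c(TRUE, FALSE, TRUE),
           tmp3 = c(FALSE, FALSE, TRUE))

# rowSums() coerces logicals to 1/0, giving the number of TRUE values per row
n_true <- rowSums(m)

# ifelse() is vectorized, so a single call labels every row; no apply() needed
label <- ifelse(n_true == 0, "N", "Y")

print(n_true)  # 2 0 2
print(label)   # "Y" "N" "Y"
```

Only the second row is all FALSE, so it is the only one labelled "N".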
Answer 2:
How about using Reduce() and logical OR?
set.seed(151)
tibble(tmp1 = sample(c(T, F), size = 32, replace = T),
       tmp2 = sample(c(T, F), size = 32, replace = T),
       tmp3 = sample(c(T, F), size = 32, replace = T)) %>%
  mutate(tmp = Reduce(`|`, list(tmp1, tmp2, tmp3)))
## A tibble: 32 x 4
# tmp1 tmp2 tmp3 tmp
# <lgl> <lgl> <lgl> <lgl>
# 1 TRUE TRUE FALSE TRUE
# 2 FALSE FALSE TRUE TRUE
# 3 FALSE FALSE TRUE TRUE
# 4 FALSE FALSE TRUE TRUE
# 5 TRUE FALSE FALSE TRUE
# 6 FALSE FALSE FALSE FALSE
# 7 TRUE FALSE FALSE TRUE
# 8 FALSE TRUE FALSE TRUE
# 9 TRUE TRUE FALSE TRUE
#10 FALSE FALSE TRUE TRUE
## ... with 22 more rows
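Note that the Reduce() version yields a logical column rather than the "Y"/"N" labels from the question. One possible way to get the labels is to wrap the result in dplyr::if_else(), as in the sketch below. The fixed data and the names df and res are hypothetical, chosen only to keep the example deterministic.

```r
library(dplyr)

# Hypothetical fixed data so the result is reproducible without a seed
df <- tibble(tmp1 = c(TRUE, FALSE, FALSE),
             tmp2 = c(TRUE, FALSE, TRUE),
             tmp3 = c(FALSE, FALSE, TRUE))

# Reduce(`|`, ...) ORs the columns element-wise; if_else() maps TRUE/FALSE to "Y"/"N"
res <- df %>%
  mutate(tmp = if_else(Reduce(`|`, list(tmp1, tmp2, tmp3)), "Y", "N"))

print(res$tmp)  # "Y" "N" "Y"
```

Like the rowSums() variants, this stays fully vectorized and avoids calling case_when() inside apply().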
Answer 3:
As it turns out, this is a bug, probably related to the hybrid evaluator: https://github.com/tidyverse/dplyr/issues/3422
Source: https://stackoverflow.com/questions/49268989/dplyrcase-when-evaluation-error-object-x-not-found