Question
x <- seq(0.1,10,0.1)
y <- if (x < 5) 1 else 2
I want the if to be evaluated for each element separately instead of once for the whole vector.
What do I have to change?
Answer 1:
> x <- seq(0.1, 10, 0.1)
> x
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5
[16] 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
[31] 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 4.1 4.2 4.3 4.4 4.5
[46] 4.6 4.7 4.8 4.9 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0
[61] 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 7.0 7.1 7.2 7.3 7.4 7.5
[76] 7.6 7.7 7.8 7.9 8.0 8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 9.0
[91] 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 10.0
> ifelse(x < 5, 1, 2)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[38] 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
Answer 2:
For completeness: with large vectors, you can use logical indexing to speed things up (we do that often in simulations, where functions typically run 1000 to 10000 times). But as long as the speed-up isn't necessary, just use ifelse. It is a lot easier to read.
> set.seed(100)
> x <- runif(1000,1,10)
> system.time(replicate(10000,{
+ y <- ifelse(x < 5,1,2)
+ }))
   user  system elapsed
   2.56    0.08    2.64
> system.time(replicate(10000,{
+ y <- rep(2,length(x))
+ y[x < 5]<- 1
+ }))
   user  system elapsed
   0.48    0.00    0.48
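Before swapping one form for the other, it may be worth checking that both give the same result on your data. A minimal sketch, assuming x contains no NA values (the names y_ifelse, y_index and same_result are only illustrative):

```r
# Compare the two approaches timed above on the same input.
set.seed(100)
x <- runif(1000, 1, 10)

# Approach 1: ifelse() picks 1 or 2 for every element.
y_ifelse <- ifelse(x < 5, 1, 2)

# Approach 2: fill with the "else" value, then overwrite by logical index.
y_index <- rep(2, length(x))
y_index[x < 5] <- 1

# identical() is TRUE only if type, length and all values match.
same_result <- identical(y_ifelse, y_index)
print(same_result)
```

With missing values the two are not guaranteed to agree, since ifelse keeps NA in the result, so treat that case separately.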
Answer 3:
y <- if (x < 5) 1 else 2 does not operate on the whole vector: the warning you receive tells you that only the first element of the condition will be used (since R 4.2.0 a condition of length greater than one is an error rather than a warning). You want ifelse:
y <- ifelse(x < 5, 1, 2)
ifelse operates on the whole logical vector, element by element, whereas if accepts only a single logical value. See ?"if" and ?ifelse.
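To make the difference concrete, here is a small sketch. The input vectors and the vapply variant are my own illustration, not part of the answer above; vapply is shown only to demonstrate that if works on one value at a time, not as a recommended replacement.

```r
x <- c(1, 7, NA, 3)

# ifelse() evaluates the test for every element and returns a vector
# as long as the test; an NA in the test stays NA in the result.
y <- ifelse(x < 5, 1, 2)
print(y)   # 1 2 NA 1

# if () needs a single TRUE/FALSE, so it has to be applied to one
# element at a time, here through vapply() on a vector without NA.
x_complete <- c(1, 7, 3)
y_scalar <- vapply(x_complete, function(xi) if (xi < 5) 1 else 2, numeric(1))
print(y_scalar)   # 1 2 1
```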
Answer 4:
You could also just create a logical vector and add 1 to it:
x <- seq(0.1, 10, 0.1) # Your data set
(x >= 5) + 1
# [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
# [92] 2 2 2 2 2 2 2 2 2
If you would like to compare performance, this is the fastest of the three solutions:
set.seed(100)
x <- runif(1e6, 1, 10)
RL <- function(x) y <- ifelse(x < 5,1,2)
JM <- function(x) {y <- rep(2, length(x)); y[x < 5] <- 1; y}
DA <- function(x) y <- (x >= 5) + 1
library(microbenchmark)
microbenchmark(RL(x),
JM(x),
DA(x))
# Unit: milliseconds
#   expr       min        lq      mean    median        uq       max neval
#  RL(x) 331.83448 366.52940 378.89182 374.99741 381.08659 609.21218   100
#  JM(x)  38.72894  42.18745  44.36493  43.25086  44.09626  82.76168   100
#  DA(x)  10.01644  11.96482  14.21593  13.17825  14.12930  53.76923   100
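The trick works because R coerces FALSE to 0 and TRUE to 1 in arithmetic. A short sketch of that coercion; the lookup vector c("low", "high") is a hypothetical extension for cases where the two results are not 1 and 2:

```r
x <- c(0.5, 4.9, 5, 9.2)

is_high <- x >= 5            # FALSE FALSE TRUE TRUE
code <- is_high + 1          # logical is coerced to 0/1, giving 1 1 2 2
print(code)

# The codes can index a lookup vector to produce arbitrary values
# (the "low"/"high" labels are made up for this illustration).
grp <- c("low", "high")[code]
print(grp)                   # "low" "low" "high" "high"
```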
Answer 5:
Following the post above, you can also use and modify only the elements of a vector that satisfy the condition. In my opinion, if the faster approach is no more costly to write, one should always use it.
x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*2
The code in the previous post is the best answer to the question. But if I had to use the code above, I would write:
x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*0 +1
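If the same condition is used on both sides of the assignment, it can be computed once and reused. A minimal sketch of that variation (the name below is only illustrative):

```r
x <- seq(0.1, 10, 0.1)

below <- x < 5                # compute the logical index once
y <- rep(2, length(x))        # default value for every element
y[below] <- x[below] * 2      # element-dependent value where the condition holds

print(head(y))
```

Both sides of the assignment select the same positions, so the replacement values stay aligned with the elements they came from.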
Answer 6:
# Summary statistics that ignore the placeholder value -1
nzMean <- function(x) { mean(x[x != -1], na.rm = TRUE) }
nzMin <- function(x) { min(x[x != -1], na.rm = TRUE) }
nzMax <- function(x) { max(x[x != -1], na.rm = TRUE) }
nzRange <- function(x) { nzMax(x) - nzMin(x) }
nzSD <- function(x) { sd(x[x != -1], na.rm = TRUE) }
# The following function works: yes and no both have the same length as the test
nzN1 <- function(x) { ifelse(x != -1, (x - nzMin(x)) / nzRange(x), x) }
# The following is bad: it passes only 4, not 5, elements as the yes vector, so ifelse recycles them and the values are misaligned
nzN2 <- function(x) { ifelse(x != -1, (x[x != -1] - nzMin(x)) / nzRange(x), x) }
# The following is bad as well: it returns 5 elements of the vector, but not the correct answer
nzN3 <- function(x) { ifelse(x != -1, (x[x != -1] - nzMin(x)) / nzRange(x), -1) }
y<-c(1,-1,-20,2,4)
a<-nzMean(y)
b<-nzMin(y)
c<-nzMax(y)
d<-nzRange(y)
# test the working function
z<-nzN1(y)
print(z)
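ifelse expects its yes and no arguments to line up with the test, so a shorter yes such as x[x != -1] gets recycled. One way to sidestep this is indexed assignment, where both sides use the same index. A sketch under that assumption; nzNormalize is a hypothetical helper, not part of the code above:

```r
# Hypothetical helper: rescale every element except the placeholder -1
# to the range [0, 1], leaving -1 (and NA) untouched.
nzNormalize <- function(x) {
  keep <- !is.na(x) & x != -1          # positions to rescale
  lo <- min(x[keep])
  hi <- max(x[keep])
  out <- x
  # Both sides have sum(keep) elements, so nothing is recycled.
  out[keep] <- (x[keep] - lo) / (hi - lo)
  out
}

y <- c(1, -1, -20, 2, 4)
z2 <- nzNormalize(y)
print(z2)   # 0.875 -1 0 0.9166667 1
```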
Source: https://stackoverflow.com/questions/4042413/vectorized-if-statement-in-r