Question
I am trying to generate a random sample that excludes certain "bad data." I do not know whether the data is "bad" until after I sample it. Thus, I need to make a random draw from the population and then test it. If the data is "good" then keep it. If the data is "bad" then randomly draw another and test it. I would like to do this until my sample size reaches 25. Below is a simplified example of my attempt to write a function that does this. Can anyone please tell me what I am missing?
df <- data.frame(NAME=c(rep('Frank',10),rep('Mary',10)), SCORE=rnorm(20))
df
random.sample <- function(x) {
x <- df[sample(nrow(df), 1), ]
if (x$SCORE > 0) return(x)
#if (x$SCORE <= 0) run the function again
}
random.sample(df)
Answer 1:
Here is a general use of a while loop:
random.sample <- function(df) {
  success <- FALSE
  while (!success) {
    # do something: draw one random row
    i <- sample(nrow(df), 1)
    x <- df[i, ]
    # check for success
    success <- x$SCORE > 0
  }
  return(x)
}
An alternative is to use repeat (syntactic sugar for while(TRUE)) and break:
random.sample <- function(df) {
  repeat {
    # do something: draw one random row
    i <- sample(nrow(df), 1)
    x <- df[i, ]
    # exit if the condition is met
    if (x$SCORE > 0) break
  }
  return(x)
}
where break makes you exit the repeat block. Alternatively, you could have if (x$SCORE > 0) return(x) to exit the function directly.
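The question asks for a sample of 25 good rows, while the loops above return one row per call. A minimal sketch of one way to collect the full sample with the same while pattern follows. The helper name sample.good, the n argument, and the fixed seed are assumptions made for illustration, not part of the original answer. Because df has only 20 rows, the same row can be drawn more than once.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(NAME = c(rep("Frank", 10), rep("Mary", 10)), SCORE = rnorm(20))

# Hypothetical helper: draw single rows until n "good" ones are collected.
sample.good <- function(df, n = 25) {
  kept <- vector("list", n)   # preallocate one slot per accepted row
  count <- 0
  while (count < n) {
    x <- df[sample(nrow(df), 1), ]
    if (x$SCORE > 0) {        # keep the row only if it passes the test
      count <- count + 1
      kept[[count]] <- x
    }
  }
  do.call(rbind, kept)        # stack the accepted rows into one data frame
}

good <- sample.good(df, 25)
nrow(good)
all(good$SCORE > 0)
```

Preallocating the list and binding once at the end avoids growing a data frame inside the loop, which gets slow for larger samples.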
Answer 2:
Use this after drawing your first sample into x:
while (any(bad <- (x$SCORE <= 0)))
x[bad, ] <- df[sample(nrow(df), sum(bad)), ]
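The snippet assumes x already holds the first sample. Here is a self-contained sketch of the same idea for a sample of 25. The seed and the variable n are assumptions for illustration, and replace = TRUE is added because 25 draws from a 20-row data frame are impossible without replacement.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(NAME = c(rep("Frank", 10), rep("Mary", 10)), SCORE = rnorm(20))

n <- 25
# First draw: n rows at once, with replacement since n > nrow(df)
x <- df[sample(nrow(df), n, replace = TRUE), ]

# Redraw only the rows that failed the test until none are left
while (any(bad <- (x$SCORE <= 0))) {
  x[bad, ] <- df[sample(nrow(df), sum(bad), replace = TRUE), ]
}

nrow(x)
all(x$SCORE > 0)
```

Note that the assignment overwrites the values in the bad rows but leaves the row names of x untouched, so after a redraw the row names no longer identify the source row in df.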
Answer 3:
random.sample <- function(df) {
  x <- df[sample(nrow(df), 1), ]
  if (x$SCORE > 0) return(x)
  Recall(df)  # run the function again
}
random.sample(df)
# NAME SCORE
#14 Mary 1.252566
It seems to me that this should work as well:
df$SCORE[ df$SCORE > 0 ][ sample(1:sum(df$SCORE > 0), 1) ]
#[1] 0.6579631
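One caveat with the Recall version, and with any retry loop: if no row has SCORE > 0, a loop never finishes and the recursion keeps nesting until R stops it with an error. A hedged sketch of a capped variant follows. The function name random.sample.capped and the max.tries argument are hypothetical, chosen only for illustration.

```r
# Hypothetical capped variant: give up after max.tries unsuccessful draws.
random.sample.capped <- function(df, max.tries = 1000) {
  for (attempt in seq_len(max.tries)) {
    x <- df[sample(nrow(df), 1), ]
    if (x$SCORE > 0) return(x)   # success: exit the function directly
  }
  stop("no row with SCORE > 0 found after ", max.tries, " tries")
}

# A population with no good rows now fails fast instead of looping forever
all.bad <- data.frame(NAME = "Frank", SCORE = -1)
result <- tryCatch(random.sample.capped(all.bad, 50),
                   error = function(e) "gave up")

# A population with a good row still returns it
one.good <- data.frame(NAME = "Mary", SCORE = 2)
hit <- random.sample.capped(one.good)
```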
Answer 4:
You can filter first and sample directly from the good rows, like so (here drawing just 5):
> df <- data.frame(NAME=c(rep('Frank',10),rep('Mary',10)), SCORE=rnorm(20))
> df[sample(which(df$SCORE>0), 5),]
NAME SCORE
14 Mary 1.0858854
10 Frank 0.7037989
16 Mary 0.7688913
5 Frank 0.2067499
17 Mary 0.4391216
This samples without replacement; for a bootstrap, add replace=TRUE.
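For the 25 rows the question asks for, a sketch along the same lines might look like the following. The seed and the variable names good, picked and result are assumptions for illustration. It samples positions within the vector of good indices rather than passing that vector to sample() directly, because sample(k, ...) with a single number k draws from 1:k, which would give wrong rows if only one row happened to be good.

```r
set.seed(7)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(NAME = c(rep("Frank", 10), rep("Mary", 10)), SCORE = rnorm(20))

good <- which(df$SCORE > 0)   # row indices that pass the test

# Sample positions within `good`, with replacement since 25 may exceed length(good)
picked <- good[sample.int(length(good), 25, replace = TRUE)]
result <- df[picked, ]

nrow(result)
all(result$SCORE > 0)
```

This assumes at least one row passes the test; with an empty good vector, sample.int() stops with an error.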
Source: https://stackoverflow.com/questions/20507247/r-repeat-function-until-condition-met