Generate random integers between two values with a given probability using R

不思量自难忘° 2020-12-16 04:56

I have the following four number sets:

A=[1,207];
B=[208,386];
C=[387,486];
D=[487,586].

I need to generate 20000 random numbers between 1

2 answers
  • 2020-12-16 05:24

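    You can use sample directly, more specifically its prob argument. Just spread each category's probability evenly over the numbers in that category, so that each of the 586 numbers gets its own weight. Each number in category A gets a weight of 0.5/207, and so on.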

    A <- 1:207
    B <- 208:386
    C <- 387:486
    D <- 487:586
    L <- sapply(list(A, B, C, D), length)
    
    x <- sample(c(A, B, C, D),
                size = 20000,
                prob = rep(c(1/2, 1/6, 1/6, 1/6) / L, L),
                replace = TRUE)
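
    As a rough sanity check on the approach above, the sketch below repeats the draw and tabulates the share of values landing in each range. The seed and the names w, grp and shares are my own additions for illustration, not part of the original answer.

    ```r
    set.seed(1)  # arbitrary seed, only so the check is reproducible

    A <- 1:207
    B <- 208:386
    C <- 387:486
    D <- 487:586
    L <- lengths(list(A, B, C, D))

    # Per-number weights: each group's probability spread evenly over its members
    w <- rep(c(1/2, 1/6, 1/6, 1/6) / L, L)

    x <- sample(c(A, B, C, D), size = 20000, prob = w, replace = TRUE)

    # Label each draw by the range it falls in, then tabulate observed shares
    grp <- cut(x, breaks = c(0, 207, 386, 486, 586), labels = c("A", "B", "C", "D"))
    shares <- prop.table(table(grp))
    print(round(shares, 3))
    ```

    The shares should come out close to 1/2, 1/6, 1/6 and 1/6, with the usual sampling noise.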
    
  • 2020-12-16 05:30

    I would suggest the roulette selection method. Here is a brief explanation. Take a line of length 1 unit and break it into pieces in proportion to the probability values. In our case, the first piece has length 1/2 and each of the next three pieces has length 1/6. Now sample a number between 0 and 1 from the uniform distribution. Since every number is equally likely, the probability that the sampled number falls in a given piece equals the length of that piece. So, whichever piece the number falls in, sample from the corresponding vector. (The R code is below; you can run it for a large number of draws to check that this holds.)

    It is called roulette selection because of another analogy for the same situation: take a circle and split it into sectors whose angles are proportional to the probability values. Then sample a number from the uniform distribution, see which sector it falls in, and sample uniformly from the corresponding vector.
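
    To make the mapping from a point on the unit line to a piece concrete, here is a small sketch. The four values in u are hand-picked stand-ins for uniform draws, and the names u, piece and labels are my own.

    ```r
    probVec    <- c(A = 1/2, B = 1/6, C = 1/6, D = 1/6)
    cumProbVec <- cumsum(probVec)  # right-hand end of each piece on the unit line

    # Hand-picked points in [0, 1), standing in for runif() draws
    u <- c(0.10, 0.55, 0.70, 0.95)

    # For each point, the index of the first piece whose right-hand end lies beyond it
    piece <- vapply(u, function(v) which(v < cumProbVec)[[1]], integer(1))

    labels <- names(probVec)[piece]
    print(labels)  # "A" "B" "C" "D"
    ```

    The pieces cover [0, 1/2), [1/2, 2/3), [2/3, 5/6) and [5/6, 1), so each point lands in a different one.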

    A <- 1:207
    B <- 208:386
    C <- 387:486
    D <- 487:586
    
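    cumList <- list(A, B, C, D)
    
    probVec <- c(1/2, 1/6, 1/6, 1/6)
    
    # Right-hand end of each piece on the unit line
    cumProbVec <- cumsum(probVec)
    
    # Preallocate the result instead of growing it inside the loop
    ret <- integer(20000)
    
    for (i in seq_along(ret)) {
    
      rand <- runif(1)
    
      # Index of the first piece whose right-hand end lies beyond rand
      whichVec <- which(rand < cumProbVec)[1]
    
      # Draw one value uniformly from the chosen vector
      ret[i] <- sample(cumList[[whichVec]], 1)
    
    }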
    
    # Testing the results
    
    sum(ret %in% A) # roughly 1/2 * 20000 of the values
    
    sum(ret %in% B) # roughly 1/6 * 20000 of the values
    
    sum(ret %in% C) # roughly 1/6 * 20000 of the values
    
    sum(ret %in% D) # roughly 1/6 * 20000 of the values
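
    The loop above calls runif and sample 20000 times each. A possible vectorised variant of the same two-stage idea is sketched below. It is not part of the original answer: it swaps the which() lookup for findInterval(), and the names groups, breaks and n are my own.

    ```r
    set.seed(42)  # arbitrary seed, only so the run is reproducible

    groups  <- list(A = 1:207, B = 208:386, C = 387:486, D = 487:586)
    probVec <- c(1/2, 1/6, 1/6, 1/6)
    n       <- 20000

    # Stage 1: pick a piece for all n draws at once. findInterval() locates
    # each uniform draw among the left-hand ends 0, 1/2, 2/3, 5/6 of the pieces.
    breaks   <- c(0, cumsum(probVec)[-length(probVec)])
    whichVec <- findInterval(runif(n), breaks)

    # Stage 2: for each group, fill its positions with uniform draws from it
    ret <- integer(n)
    for (k in seq_along(groups)) {
      idx <- whichVec == k
      ret[idx] <- sample(groups[[k]], sum(idx), replace = TRUE)
    }

    # Observed share of draws per group
    print(round(vapply(groups, function(g) mean(ret %in% g), numeric(1)), 3))
    ```

    The remaining loop runs only once per group (four times here), so the cost no longer grows with a per-draw loop.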
    