Cannot understand why random number algorithms give different results

寵の児 提交于 2020-02-07 05:44:06

问题


I'm trying to understand why the following 2 code snippets lead to different output and I'm hoping someone can enlighten me. The second version is marginally faster - which is my motivation since this all sits in a loop, but I need to know if it is equivalent and if not then why ?

    things <- 100000
    n.events <- 3
    probs <- rep(0.05, n.events)

    # Method 1
    set.seed(1)
    events1 <- matrix(n.events, nrow=things)
    events1 <- sapply(probs, function(probs) rbinom(things,1,probs))

    # Method 2
    set.seed(1)
    events2 <- matrix( rbinom(things*n.events, 1, probs), ncol=n.events,    byrow=TRUE)

    > identical(events1, events2)
    [1] FALSE

回答1:


To have the same result, you need to put byrow=FALSE instead of byrow=TRUE for the second option.

the first way you're computing the random matrix, it computes rbinom(things,1,probs) length(probs) times, obtaining each time a things long vector.

To obtain the exact same matrix with the second way, which computes a length(probs)*things long vector, you need to "organize" it in length(probs) vectors of length things.

(either way, seed is "treated" exactly the same "in between" the computations so the numbers obtained are the same)

EDIT

If you want to use different values for the vector probs, you need to compute events2 in a way that the probs are used in the same "order" than for events1 so first the first value of vector probs, until the desired length is reached, then the second value, etc. and then, organize the columns of your matrix the same way as before, so using byrow=FALSE.

Example

# define the parameters:
things <- 100000
n.events <- 3
probs <- c(0.05, 0.01, 0.20)

# compute the random vectors with sapply
set.seed(1)
events1 <- sapply(probs, function(probs) rbinom(things,1,probs))

# compute the random vectors "directly", using the "right" probs
set.seed(1)
events2 <- matrix(rbinom(things*n.events, 1, rep(probs, e=things)), ncol=n.events,    byrow=FALSE)

# check the results
identical(events1, events2)
[1] TRUE


来源:https://stackoverflow.com/questions/30805012/cannot-understand-why-random-number-algorithms-give-different-results

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!