Question
I am asked to "simulate x as an independent identically distributed (iid) normal variable with mean=0, std=1.5 with sample length 500"
I am doing the sampling in the following two ways:
set.seed(8402)
X <- rnorm(500, 0, 1.5)
head(X)
and I got
-1.8297969 -0.1862884 1.4219400 -1.0841421 -1.5276701 1.6159368
However, if I do
X <- replicate(500, rnorm(1,0,1.5))
head(X)
and I got
-0.04032755 0.92002552 -2.28001943 -1.36840869 1.49820718 0.06205003
My question is: what is the right way to generate iid normal variables? What is the difference between those two ways?
Many thanks!
Answer 1:
R internals
Internally in R, the C function from <Rmath.h>, double rnorm (double mean, double sd), generates one random number at a time. When you call its R wrapper function rnorm(n, mean, sd), it calls the C-level function n times.
This is the same as calling the R-level function with n = 1 and repeating that call n times using replicate.
The first method is much faster (though the difference will probably only be noticeable when n is really large), as everything is done at C level. replicate, however, is a wrapper of sapply, so it is not really a vectorized function (see Is the "*apply" family really not vectorized?).
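A rough way to see the speed difference is to time both calls. This is only a sketch: the sample size n = 1e5 is an assumption chosen to make the gap visible, and the actual timings depend on the machine, so no numbers are quoted here.

```r
n <- 1e5  # assumed sample size, picked only so the gap is measurable

# One vectorized call: the loop over n happens at C level.
t_direct <- system.time(x_direct <- rnorm(n, 0, 1.5))

# n separate R-level calls to rnorm, driven by replicate (built on sapply).
t_replicated <- system.time(x_replicated <- replicate(n, rnorm(1, 0, 1.5)))

# Compare elapsed wall-clock time (in seconds) for the two approaches.
t_direct["elapsed"]
t_replicated["elapsed"]
```

Both calls return a numeric vector of length n; only the cost of producing it differs.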
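In addition, if you set the same random seed before each of the two calls, you are going to get the same set of random numbers. The two outputs in the question differ only because the seed was set once, before the first call, and was not reset before the second.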
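This can be checked directly with the seed and parameters from the question. The variable names x_direct and x_replicated are made up for this sketch, and it assumes R's default random number generator settings.

```r
# Draw all 500 values in one call.
set.seed(8402)
x_direct <- rnorm(500, 0, 1.5)

# Reset to the same seed, then draw the 500 values one at a time.
set.seed(8402)
x_replicated <- replicate(500, rnorm(1, 0, 1.5))

# Compare the two vectors element by element.
identical(x_direct, x_replicated)
```

So both ways are valid for simulating an iid sample; they consume the same random number stream in the same order.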
A more illustrative experiment
In my comment below, I say that the random seed is only set once, on entry. To help people understand this, I provide this example. There is no need to use a large n; n = 4 is sufficient.
First, let's set the seed to 0 and generate 4 standard normal samples:
set.seed(0); rnorm(4, 0, 1)
## we get
[1] 1.2629543 -0.3262334 1.3297993 1.2724293
Note that in this case, all 4 numbers are obtained from the entry seed 0.
Now, let's do this:
set.seed(0)
rnorm(2, 0, 1)
## we get
[1] 1.2629543 -0.3262334
## do not reset seed, but continue with the previous seed
replicate(2, rnorm(1, 0, 1))
## we get
[1] 1.329799 1.272429
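See? The replicate call picks up the random number stream where rnorm(2, 0, 1) left off, so it returns the third and fourth numbers from the first run.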
But if we reset the seed in the middle, for example by setting it back to 0:
set.seed(0)
rnorm(2, 0, 1)
## we get
[1] 1.2629543 -0.3262334
## reset seed
set.seed(0)
replicate(2, rnorm(1, 0, 1))
## we get
[1] 1.2629543 -0.3262334
This is what I mean by "entry".
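The whole experiment can be condensed into a single check. This is a sketch with made-up variable names (all_at_once, in_pieces), again assuming R's default generator settings.

```r
# All 4 numbers drawn from the entry seed 0 in one call.
set.seed(0)
all_at_once <- rnorm(4, 0, 1)

# Same entry seed, but the draws are split across two different calls
# with no reset in between.
set.seed(0)
in_pieces <- c(rnorm(2, 0, 1), replicate(2, rnorm(1, 0, 1)))

# The stream only depends on the entry seed, not on how the draws are grouped.
identical(all_at_once, in_pieces)
```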
Source: https://stackoverflow.com/questions/36428415/what-is-difference-between-replicate-n-times-and-generate-n-directly-in-sampling