sample integer values with specified mean

五迷三道 posted on 2020-01-04 14:19:17

Question


I want to generate a sample of integers in R with a specified mean. I used mu+sd*scale(rnorm(n)) to generate a sample of n values whose mean is exactly mu, but this produces floating-point values; I would like integers instead. For example, for a sample of size n=5 with mean=4, one possible set of generated values would be {2,6,4,3,5}. Any ideas on how to do this in R while satisfying the constraint on the mean?


Answer 1:


Picking n values with a mean of m is equivalent to picking n values that sum to m*n, so m*n must itself be an integer. (I'm assuming you're going to stick to positive integers; otherwise things get much harder!) Here is a solution based on sampling uniformly from the partitions of m*n (unordered collections of positive integers that add up to the desired total). I'm not sure it's what you want, since it samples uniformly over partitions rather than over values ... perhaps someone else can do better, or figure out how to reweight the samples.

This brute-force solution will also probably fail for totals much larger than in your example, because parts() enumerates every partition: there are 627 partitions for a total of 20, 5604 for a total of 30, 37338 for a total of 40 ...
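
To get a feel for how fast the enumeration grows, the partition counts quoted above can be reproduced without building the partitions themselves. This is only a rough sketch in base R using the standard "coin change" recurrence; count_partitions is a hypothetical helper name, not a function from the partitions package.

```r
## Hypothetical helper: number of partitions of `total` into positive integers.
count_partitions <- function(total) {
  ## ways[k + 1] holds the number of partitions of k built from the parts seen so far
  ways <- c(1, rep(0, total))
  for (part in seq_len(total)) {
    for (k in part:total) {
      ways[k + 1] <- ways[k + 1] + ways[k - part + 1]
    }
  }
  ways[total + 1]
}

sapply(c(20, 30, 40), count_partitions)  ## 627 5604 37338
```

The counts are for all partitions of the total; the number with exactly n parts is smaller, but parts() still has to generate the full set first.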

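m <- 4
n <- 5
library("partitions")
pp <- parts(m*n) ## all partitions of m*n (=20 here), one per column, zero-padded
## restrict to partitions with exactly n (=5) non-zero values;
## drop=FALSE keeps a matrix even if only one partition qualifies
ppn <- pp[1:n, colSums(pp>0)==n, drop=FALSE]
set.seed(101) ## for reproducibility
## sample uniformly from this set
## (values come out in decreasing order; wrap the result in sample() to shuffle)
ppn[,sample(ncol(ppn),size=1)]  ## e.g. 9, 5, 4, 1, 1 (exact draw depends on the R version's RNG defaults)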
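
If uniform sampling over ordered tuples of positive integers is acceptable, a different technique avoids the enumeration entirely: the "stars and bars" construction, which picks n-1 distinct cut points among the m*n-1 gaps between m*n unit items. This is a sketch of a swapped-in approach, not the partition-based method above, and sample_with_mean is a hypothetical name chosen for illustration. It assumes m*n is an integer and that every value must be at least 1.

```r
## Hypothetical helper: n positive integers with mean exactly m,
## drawn uniformly over ordered tuples (compositions of m*n into n parts).
sample_with_mean <- function(n, m) {
  total <- m * n
  stopifnot(n >= 1, total == round(total), total >= n)
  ## choose n - 1 distinct cut points among the total - 1 gaps
  cuts <- sort(sample.int(total - 1, n - 1))
  ## gaps between consecutive cut points are the sampled values
  diff(c(0, cuts, total))
}

set.seed(101)  ## for reproducibility
x <- sample_with_mean(5, 4)
sum(x)   ## always m*n = 20, so mean(x) is exactly 4
```

Because the values are differences between sorted distinct cut points, each one is at least 1 and they always add up to m*n. The resulting distribution differs from the partition-based one: here every ordered tuple is equally likely, whereas above every unordered partition is.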


Source: https://stackoverflow.com/questions/26569456/sample-integer-values-with-specified-mean
