Shuffle One Variable Within Group

Submitted by 对着背影说爱祢 on 2019-12-11 16:43:28

Question


This question is an extension of the excellent answer provided by Robert Picard here: How to Randomly Assign to Groups of Different Sizes

We have this dataset, which is the same as in the previous question, but adds the year variable:

sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg="NCntrl" if reg=="N Cntrl"
drop region
gen year=20 
replace year=30 if _n>15
replace year=40 if _n>35

If I just wanted to randomly reassign the values of reg across all observations (without regard to group), I could implement the answer to the previous post:

tempfile orig
save `orig'
keep reg
rename reg reg_new
set seed 234
gen double u = runiform()
sort u reg_new
merge 1:1 _n  using `orig', nogen

How would the code be modified so that reg is shuffled, but only within year? For example, there are 15 observations where year==20. These observations should be shuffled separately from those of the other years.


Answer 1:


Shuffling one variable doesn't require any file choreography. This can probably be shortened:

sysuse auto, clear 
set seed 2803 

gen double shuffle = runiform() 

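* example 1: shuffle mpg across all observations 
sort shuffle 
* which is the rank of each observation's random draw 
gen long which = _n 
* re-sort on something other than shuffle, then take mpg from position which 
sort mpg 
gen mpg_new = mpg[which] 
list which mpg* 

* example 2: shuffle mpg separately within each category of foreign 
bysort foreign (shuffle) : gen long which2 = _n 
* under by:, subscripts are counted within each group 
bysort foreign (mpg) : gen mpg2 = mpg[which2] 
list which2 mpg mpg2, sepby(foreign) 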
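
Example 2 carries over to the data in the question more or less directly. What follows is a minimal sketch rather than a tested solution: it rebuilds the question's dataset, then shuffles reg within year. The names u and reg_new are borrowed from the question and which from the examples above. Sorting on state before the lookup is my own choice; any order within year that does not depend on u should work.

```stata
* rebuild the dataset from the question
sysuse census, clear
keep state region pop
decode region, gen(reg)
replace reg = "NCntrl" if reg == "N Cntrl"
drop region
gen year = 20
replace year = 30 if _n > 15
replace year = 40 if _n > 35

* one random draw per observation
set seed 234
gen double u = runiform()

* within each year, which is the rank of the random draw
bysort year (u) : gen long which = _n

* re-sort within year on something unrelated to u, then look up reg
* at position which (subscripts count within group under by:)
bysort year (state) : gen reg_new = reg[which]

list year state reg reg_new, sepby(year)
```

The second bysort matters: if the data were still sorted by u within year, which would equal _n and reg_new would simply reproduce reg.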

All that said, I think sample does this, so long as you specify a sample size equal to the number of observations in the dataset. It's overkill here because it works on whole observations, so you get all the variables rather than just one.



Source: https://stackoverflow.com/questions/48887536/shuffle-one-variable-within-group
