Randomly select groups (and all cases per group) in R?

时光怂恿深爱的人放手 posted on 2019-12-06 11:24:00

This is pretty straightforward if you use sample and then index. Here's a made-up example that looks similar to what you've presented. It's really only two lines of code, and could be done in one if you wanted.

#made-up data: 20000 rows (the 10000 id values are recycled to fill them)
dat <- data.frame(id=paste0(LETTERS[1:8], rep(1:1250, 8)),
   year=as.factor(as.character(sample(1990:2012, 20000, TRUE))),
   var1=rnorm(20000), var2=rnorm(20000))

#a look at the data
head(dat)

#sample 20 ids at random
(ids <- sample(unique(dat$id), 20))

#narrow your data set to every row belonging to a sampled id
dat2 <- dat[dat$id %in% ids, ]
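
To see that this keeps whole groups rather than individual rows, here is a minimal sketch on a small made-up data frame. The names toy, picked, and kept are illustrative only and do not come from the question. It samples 3 of 6 ids and then checks that each sampled id still has all of its rows.

```r
set.seed(42)  # fixed seed so the draw is reproducible

# toy data: 6 groups (ids "a".."f"), 4 rows per group
toy <- data.frame(id = rep(letters[1:6], each = 4),
                  var1 = rnorm(24),
                  stringsAsFactors = FALSE)

# step 1: sample groups, not rows
picked <- sample(unique(toy$id), 3)

# step 2: keep every row that belongs to a sampled group
kept <- toy[toy$id %in% picked, ]

# each sampled id should still have all 4 of its rows
table(kept$id)
```

Sampling from unique(toy$id) rather than from toy$id is what makes each group equally likely to be drawn regardless of how many rows it has.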

A one-line alternative uses subset:

subset(df, id %in% sample(levels(df$id), 20))

That assumes your data frame is called df and that id is a factor (use unique instead of levels if it's not).
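
The factor caveat matters more than it used to: since R 4.0.0, data.frame() no longer converts strings to factors by default, so levels() on a character id column returns NULL. A factor can also carry levels that no longer occur in the data. The sketch below uses made-up names (df_chr, df_fac, df_sub, one_group) to show both cases, and suggests sampling from unique(as.character(id)) as one way to cover factor and character ids alike.

```r
# character id column: levels() is NULL, so there is nothing to sample from
df_chr <- data.frame(id = rep(c("a", "b", "c"), each = 2),
                     var1 = 1:6,
                     stringsAsFactors = FALSE)
is.null(levels(df_chr$id))       # TRUE

# factor id column: levels() returns the ids
df_fac <- data.frame(id = factor(rep(c("a", "b", "c"), each = 2)),
                     var1 = 1:6)
levels(df_fac$id)

# after filtering, a factor keeps levels that no longer occur in the data
df_sub <- df_fac[df_fac$id != "c", ]
levels(df_sub$id)                # still includes "c"
unique(as.character(df_sub$id))  # only the ids actually present

# sampling from the ids actually present works for both column types
one_group <- subset(df_sub, id %in% sample(unique(as.character(id)), 1))
one_group
```

If an unused level such as "c" were sampled, the filter would silently return fewer groups than requested, which is why the ids actually present are the safer thing to sample from.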
