How can I create a stratified sample in R using the "sampling" package? My dataset has 355,000 observations. The code works fine up to the last line. Below is the code I w
I don't know the strata function, but a bit of coding might do what you want:
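# Example data: 10 strata (a to j) with 35,000 rows each
d <- expand.grid(id = 1:35000, stratum = letters[1:10])
p <- 0.1  # sampling fraction per stratum
dsample <- data.frame()
system.time(
  for (i in levels(d$stratum)) {
    dsub <- d[d$stratum == i, ]                     # rows of the current stratum
    B <- ceiling(nrow(dsub) * p)                    # sample size for this stratum
    dsub <- dsub[sample(seq_len(nrow(dsub)), B), ]  # draw B rows without replacement
    dsample <- rbind(dsample, dsub)                 # append to the result
  }
)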
# Size per stratum in the resulting data frame is 10% of the original size:
table(dsample$stratum)
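If growing the data frame with rbind() inside the loop becomes slow on larger data, the same idea can be written without a loop. This is only a sketch in base R, not the strata function from the sampling package; the names idx and dsample2 and the set.seed() call are my own additions for illustration. It splits the row indices by stratum, draws the same fraction within each stratum, and subsets the data once:

```r
set.seed(42)  # assumption: fixed seed so the draw is reproducible

# Same example data as above: 10 strata with 35,000 rows each
d <- expand.grid(id = 1:35000, stratum = letters[1:10])
p <- 0.1  # sampling fraction per stratum

# Split the row numbers by stratum, then sample within each group.
# Indexing rows[...] with sample.int() stays correct even if a
# stratum happens to contain a single row.
idx <- unlist(lapply(split(seq_len(nrow(d)), d$stratum), function(rows) {
  rows[sample.int(length(rows), ceiling(length(rows) * p))]
}))

dsample2 <- d[idx, ]     # subset once instead of rbind-ing per stratum
table(dsample2$stratum)  # counts per stratum in the sample
```

Sampling row numbers rather than sub-data-frames keeps memory use low, and the single subset at the end avoids repeated copying.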
HTH, Kay
PS: CPU time on my ancient laptop is 0.09 s!