Random sampling without replacement in longitudinal data
My data is longitudinal:

    VISIT  ID   VAR1
    1      001  ...
    1      002  ...
    1      003  ...
    1      004  ...
    ...
    2      001  ...
    2      002  ...
    2      003  ...
    2      004  ...

Our end goal is to pick out 10% of the records at each visit to run a test. I tried using PROC SURVEYSELECT to do simple random sampling (SRS) without replacement, with VISIT as the strata variable. But the final sample can contain duplicate IDs. For example, ID=001 might be selected in both VISIT=1 and VISIT=2.

Is there any way to avoid this, so that each ID is selected at most once across visits, using SURVEYSELECT or another procedure (R is also fine)? Thanks a lot.

This is possible with some fairly creative data step programming. The code below uses a greedy approach, sampling