Take a random sample based on groups

说谎 2020-11-28 13:23

I have a df made up of almost 50,000 rows spread across 15 different IDs (every ID has thousands of observations). The df looks like:

        ID  Year    Temp    ph
1           


        
8 answers
  •  孤街浪徒
    2020-11-28 14:12

    Here mydata1 is your original data (not tested):
    
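    # Split the data into a list with one data frame per ID
    mydata2 <- split(mydata1, mydata1$ID)
    # Rename the list elements mydata21, mydata22, ...
    names(mydata2) <- paste0("mydata2", seq_along(mydata2))
    # Draw 500 rows without replacement from each ID
    mysample <- Map(function(x) x[sample(nrow(x), size = 500, replace = FALSE), ], mydata2)

    library(plyr)  # for rbinding the mysample
    ldply(mysample)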
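
For anyone who wants to try this without the original data, here is a rough, self-contained sketch of the same split / sample / rbind idea in base R only. The data frame is made up to resemble the one described in the question (15 IDs, a few thousand rows each); the ID labels, the value ranges and the `n_keep` name are assumptions for illustration, and 500 is simply the size used in the answer above.

```r
# Hypothetical stand-in data: 15 IDs with uneven group sizes, roughly 50,000 rows.
set.seed(42)  # only so the sketch is reproducible
n_per_id <- sample(3000:3600, 15, replace = TRUE)
df <- data.frame(
  ID   = rep(sprintf("id%02d", 1:15), times = n_per_id),
  Year = sample(2000:2020, sum(n_per_id), replace = TRUE),
  Temp = rnorm(sum(n_per_id), mean = 15, sd = 5),
  ph   = runif(sum(n_per_id), min = 6, max = 9)
)

n_keep <- 500  # rows to draw per ID (assumed value, taken from the answer above)

# Split into one data frame per ID, draw rows without replacement from each,
# then stack the pieces back together.
pieces <- lapply(split(df, df$ID), function(g) {
  # min() guards against an ID that has fewer than n_keep rows
  g[sample(nrow(g), size = min(n_keep, nrow(g))), , drop = FALSE]
})
df_sample <- do.call(rbind, pieces)
rownames(df_sample) <- NULL

table(df_sample$ID)  # number of sampled rows per ID
```

Using `do.call(rbind, ...)` avoids the plyr dependency; with only 15 groups the speed difference should not matter.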
    
