Selecting random rows by category from a data frame?

一个人的身影 2021-01-22 17:06

I have a data frame as follows:

Category Name Value

How would I select, say, 5 random names per category? Using sample returns random ro

3 answers
  •  無奈伤痛
    2021-01-22 17:19

    Best guess in absence of test cases:

      # Split by category, sample 5 rows from each piece, then stack the pieces.
      do.call( rbind, lapply( split(dfrm, dfrm$cat) ,
                             function(df) df[sample(nrow(df), 5) , ] )
              )
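
One caveat with the snippet above: sample() without replacement stops with an error if a category has fewer than 5 rows. Here is a minimal sketch that caps the sample size per group. The helper name sample_per_group and the toy data are made up for illustration and are not from the question.

```r
# Hypothetical helper: sample up to k rows from each group of a data frame.
sample_per_group <- function(data, group_col, k = 5) {
  pieces <- split(data, data[[group_col]])
  picked <- lapply(pieces, function(piece) {
    n <- min(k, nrow(piece))  # never ask for more rows than the group has
    piece[sample.int(nrow(piece), n), , drop = FALSE]
  })
  do.call(rbind, picked)
}

# Made-up data: category "a" has 8 rows, category "b" only 3.
set.seed(42)
toy <- data.frame(
  Category = c(rep("a", 8), rep("b", 3)),
  Name     = paste0("n", 1:11),
  Value    = rnorm(11)
)

result <- sample_per_group(toy, "Category", k = 5)
table(result$Category)  # counts of sampled rows per category
```

With this guard a small category simply contributes all of its rows instead of raising an error.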
    

    Tested with Jonathan's data:

    > do.call( rbind, lapply( split(df, df$Category) ,
    +                          function(df) df[sample(nrow(df), 5) , ] )
    +           )
    
          Category Name      Value   
    1.8          1    8 -0.2496109   #  useful side-effect of labeling source group
    1.15         1   15 -0.4037368
    1.17         1   17 -0.4223724
    1.12         1   12 -0.9359026
    1.18         1   18  0.3741184
    2.37         2   37  0.3033610
    2.34         2   34 -0.4517738
    2.36         2   36 -0.7695923
    snipped remainder
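
The row names in that output ("1.8", "2.37") come from rbind() pasting the list names produced by split() onto each piece's original row names. A small sketch of how that label could be read back or dropped, using a made-up stand-in data frame whose shape is only guessed from the output above (it is not Jonathan's actual data):

```r
set.seed(1)
# Made-up stand-in: 2 categories with 20 rows each.
df <- data.frame(
  Category = rep(1:2, each = 20),
  Name     = 1:40,
  Value    = rnorm(40)
)

out <- do.call(rbind, lapply(split(df, df$Category),
                             function(d) d[sample(nrow(d), 5), ]))

# Row names look like "<group>.<original row>"; pull the group back out.
group_from_rowname <- sub("\\..*$", "", rownames(out))
stopifnot(all(group_from_rowname == as.character(out$Category)))

# Drop the composite labels if plain 1..n row numbers are preferred.
rownames(out) <- NULL
head(out)
```

Keeping the composite row names is handy for tracing a sampled row back to its source, and resetting them is purely cosmetic.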
    
