Selecting random rows by category from a data frame?

一个人的身影 2021-01-22 17:06

I have a data frame as follows:

Category Name Value

How would I select, say, 5 random names per category? Using sample returns random ro

3 Answers
  •  不要未来只要你来
    2021-01-22 17:18

    If you want the same number of items from each category, this is easy:

    df[unlist(tapply(1:nrow(df),df$Category,function(x) sample(x,3))),]
    

    For example, I generated df as follows:

    df <- data.frame(Category=rep(1:5,each=20),Name=1:100,Value=rnorm(100))
    

    Then I get the following from my code:

    > df[unlist(tapply(1:nrow(df),df$Category,function(x) sample(x,3))),]
        Category Name       Value
    5          1    5  0.25151044
    20         1   20  1.52486482
    18         1   18  0.69313462
    30         2   30  0.73444185
    27         2   27  0.24000427
    39         2   39 -0.10108203
    46         3   46 -0.37200574
    49         3   49 -1.84920469
    43         3   43  0.35976388
    68         4   68  0.57879516
    76         4   76 -0.11049302
    64         4   64 -0.13471303
    100        5  100  0.95979408
    95         5   95 -0.01928741
    99         5   99  0.85725242
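
One caveat with the one-liner above: `sample(x, 3)` fails if a category has fewer than 3 rows, and when `x` is a single number `sample` draws from `1:x` instead of from `x` itself. The following is only a sketch of a more defensive variant that also uses the 5 rows per category asked for in the question; the helper name `sample_rows` is made up for illustration.

```r
set.seed(1)  # only so the illustration is reproducible
df <- data.frame(Category = rep(1:5, each = 20), Name = 1:100, Value = rnorm(100))

# Hypothetical helper: pick up to n row indices from one group.
# Sampling positions with sample.int() avoids the length-1 pitfall of sample(x, n).
sample_rows <- function(x, n) x[sample.int(length(x), min(n, length(x)))]

# Row indices grouped by category, then up to 5 sampled from each group
idx <- unlist(lapply(split(seq_len(nrow(df)), df$Category), sample_rows, n = 5))
picked5 <- df[idx, ]
```

With this approach, a category that has fewer than 5 rows contributes all of its rows instead of raising an error.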
    

    If you want different numbers of rows from each category, it will be more complicated.
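
One possible way to handle different numbers per category is to keep the desired counts in a named vector and pair each group of row indices with its count. This is a sketch under assumptions, not part of the original answer: the vector `n_per_cat` and its values are invented for illustration.

```r
set.seed(42)  # only so the illustration is reproducible
df <- data.frame(Category = rep(1:5, each = 20), Name = 1:100, Value = rnorm(100))

# Hypothetical sample sizes, keyed by category label
n_per_cat <- c("1" = 2, "2" = 5, "3" = 1, "4" = 3, "5" = 4)

# Row indices grouped by category (list names are the category labels)
idx_by_cat <- split(seq_len(nrow(df)), df$Category)

# Pair each group with its requested size; cap the size at the group length
picked <- unlist(Map(
  function(x, n) x[sample.int(length(x), min(n, length(x)))],
  idx_by_cat,
  n_per_cat[names(idx_by_cat)]
))

result <- df[picked, ]
table(result$Category)  # counts per category follow n_per_cat
```

Indexing `n_per_cat` by `names(idx_by_cat)` keeps the counts aligned with the groups even if the vector was written in a different order.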
