I have a data frame as follows:
Category Name Value
How would I select, say, 5 random names per category? Using sample
returns random rows.
If you want the same number of items from each category, this is easy:
df[unlist(tapply(1:nrow(df),df$Category,function(x) sample(x,3))),]
For example, I generated df as follows:
df <- data.frame(Category=rep(1:5,each=20),Name=1:100,Value=rnorm(100))
Then I get the following from my code:
> df[unlist(tapply(1:nrow(df),df$Category,function(x) sample(x,3))),]
    Category Name       Value
5          1    5  0.25151044
20         1   20  1.52486482
18         1   18  0.69313462
30         2   30  0.73444185
27         2   27  0.24000427
39         2   39 -0.10108203
46         3   46 -0.37200574
49         3   49 -1.84920469
43         3   43  0.35976388
68         4   68  0.57879516
76         4   76 -0.11049302
64         4   64 -0.13471303
100        5  100  0.95979408
95         5   95 -0.01928741
99         5   99  0.85725242
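The one-liner can be easier to follow when split into steps. The following is only a sketch of the same idea, not a different method: the intermediate names idx_by_cat, picked and res are made up for illustration, and set.seed is added just so the result is reproducible. It also indexes with sample.int instead of calling sample(x, 3) directly, because sample treats a single number x as 1:x, which could matter if a category had only one row.

```r
# Example data, same shape as above (the seed is an addition for reproducibility).
set.seed(1)
df <- data.frame(Category = rep(1:5, each = 20), Name = 1:100, Value = rnorm(100))

# Row indices grouped by category: a list with one integer vector per category.
idx_by_cat <- split(seq_len(nrow(df)), df$Category)

# Draw 3 positions within each group, then map them back to row indices.
# Using sample.int avoids sample()'s special case for a length-one numeric x.
picked <- lapply(idx_by_cat, function(x) x[sample.int(length(x), 3)])

# Flatten the list of indices and subset the data frame.
res <- df[unlist(picked), ]
table(res$Category)
```

To get the 5 names per category asked about in the question, change the 3 to 5; this assumes every category has at least that many rows.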
If you want a different number of rows from each category, it will be more complicated.
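One possible way to handle that case, offered as a rough sketch rather than the definitive approach, is to keep a vector of sample sizes named by category and pair it with the per-category row indices using Map. The sizes in n_per_cat are hypothetical, as are the names idx_by_cat, picked and res; each size is capped at the number of rows available in that category.

```r
# Example data, same shape as above (the seed is an addition for reproducibility).
set.seed(1)
df <- data.frame(Category = rep(1:5, each = 20), Name = 1:100, Value = rnorm(100))

# Hypothetical sample sizes, named by category value.
n_per_cat <- c("1" = 2, "2" = 5, "3" = 1, "4" = 4, "5" = 3)

# Row indices grouped by category.
idx_by_cat <- split(seq_len(nrow(df)), df$Category)

# Pair each group with its requested size; never ask for more rows than exist.
picked <- Map(function(x, n) x[sample.int(length(x), min(n, length(x)))],
              idx_by_cat, n_per_cat[names(idx_by_cat)])

res <- df[unlist(picked), ]
table(res$Category)
```

Looking sizes up by names(idx_by_cat) keeps each size matched to the right category even if n_per_cat is written in a different order.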