Given a dataframe df with a column called group, how do you randomly sample k groups from it in dplyr? It should return all rows from k groups (given there are at least k unique values in df$group), and every group in df should be equally likely to be returned.
Answer 1:
Just use sample() to choose some number of groups:

    library(dplyr)

    # Keep only the rows whose Species is one of 2 randomly chosen levels
    iris %>% filter(Species %in% sample(levels(Species), 2))

Answer 2:
Though I don't see why you'd want to do this in dplyr, given how it benchmarks against base R:
    library(dplyr)
    library(microbenchmark)

    microbenchmark(
      dplyr = iris %>% filter(Species %in% sample(levels(Species), 2)),
      base  = iris[iris[["Species"]] %in% sample(levels(iris[["Species"]]), 2), ]
    )

    Unit: microseconds
      expr     min      lq     mean  median       uq      max neval cld
     dplyr 660.287 710.655 753.6704 722.629 771.2860 1122.527   100   b
      base  83.629  95.032 110.0936 106.057 119.1715  199.949   100  a

Note: [[ is known to be faster than $, although both work.
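Both answers rely on Species being a factor, since levels() returns NULL for a character or numeric column and also lists factor levels that no longer occur in the data. For the general case in the question (a data frame df with a column called group), a minimal sketch might look like the following. The helper name sample_groups and the toy data are assumptions made for illustration, not part of either answer.

```r
library(dplyr)

# Hypothetical helper: return all rows belonging to k randomly chosen groups.
# Assumes df has a column named `group` and no column named `picked`.
sample_groups <- function(df, k) {
  groups <- unique(df$group)          # only groups actually present in df
  stopifnot(k <= length(groups))      # fail early if fewer than k groups exist
  # Sample positions rather than values: sample(x, k) would treat a
  # single numeric x as 1:x, which is wrong for a numeric group column.
  picked <- groups[sample.int(length(groups), k)]
  filter(df, group %in% picked)
}

# Toy data (illustrative only): four groups of different sizes
df <- data.frame(
  group = rep(c("a", "b", "c", "d"), times = c(1, 2, 3, 4)),
  value = 1:10
)

out <- sample_groups(df, 2)
```

Because sample.int() draws positions uniformly without replacement from the distinct group values, every group is equally likely to be chosen regardless of how many rows it has.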
Source: Randomly sample groups