Is there an equivalent to Stata's egen function? [duplicate]

Submitted by 独自空忆成欢 on 2019-12-04 11:04:32

Question


Stata has a very nice command, egen, which makes it easy to compute statistics over groups of observations. For instance, it can compute the max, the mean and the min for each group and add them as variables in the detailed data set. The Stata command is one line of code:

by group : egen max = max(x)

I've never found an equivalent command in R. summarise in the dplyr package makes it easy to compute statistics for each group, but then I have to run a loop to attach the statistic to each observation:

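library("dplyr")
N  <- 1000
# Simulated data: up to 100 groups and one numeric variable x
tf  <- data.frame(group = sample(1:100, size = N, replace = TRUE), x = rnorm(N))
table(tf$group)
# One row per group holding the group maximum
mtf  <- summarise(group_by(tf, group), max = max(x))
# Copy each group's maximum back onto the matching observations
tf$max  <- NA
for (i in 1:nrow(mtf)) {
  tf$max[tf$group == mtf$group[i]]  <- mtf$max[i]
}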

Does anyone have a better solution?


Answer 1:


Here are a few approaches:

dplyr

library(dplyr)

tf %>% group_by(group) %>% mutate(max = max(x))
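
One detail worth knowing about the dplyr version: the pipeline returns a new object rather than changing tf, and the result is still grouped unless ungroup() is called. A minimal sketch on made-up data (the values and the name tf_max are illustrative, not from the question):

```r
library(dplyr)

# Illustrative toy data, not the data from the question
tf <- data.frame(group = c(1, 1, 2, 2, 2), x = c(0.5, 2.0, -1.0, 3.0, 1.5))

tf_max <- tf %>%
  group_by(group) %>%
  mutate(max = max(x)) %>%  # group maximum repeated on every row of the group
  ungroup()                 # drop the grouping so later steps are not grouped by accident

# Same number of rows as the input, with the group maximum attached to each row
tf_max$max
```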

ave

This uses only base R:

transform(tf, max = ave(x, group, FUN = max))
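
Since the question mentions the max, the mean and the min, the same ave call can be repeated once per statistic. A sketch on made-up data, where the column names x_max, x_mean and x_min are illustrative choices rather than anything from the original post:

```r
# Illustrative toy data, not the data from the question
tf <- data.frame(group = c("a", "a", "b", "b", "b"), x = c(1, 4, 2, 8, 5))

# ave() returns a vector as long as x, with each group's statistic
# repeated for every observation in that group
tf <- transform(tf,
                x_max  = ave(x, group, FUN = max),
                x_mean = ave(x, group, FUN = mean),
                x_min  = ave(x, group, FUN = min))

tf
```

Note that transform returns a modified copy, so the result has to be assigned back to tf (or to a new name) to be kept.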

data.table

library(data.table)

dt <- data.table(tf)
dt[, max:=max(x), by=group]
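
Unlike the other two approaches, := adds the column to dt in place, so no assignment is needed. Several columns can also be added in one call. A sketch on made-up data, with illustrative column names x_max and x_min:

```r
library(data.table)

# Illustrative toy data, not the data from the question
tf <- data.frame(group = c(1, 1, 2, 2, 2), x = c(0.5, 2.0, -1.0, 3.0, 1.5))

dt <- as.data.table(tf)  # a data.table copy; tf itself is not modified

# := adds columns to dt by reference; by = group computes them per group
dt[, c("x_max", "x_min") := list(max(x), min(x)), by = group]

ncol(tf)  # still 2
ncol(dt)  # 4
```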


Source: https://stackoverflow.com/questions/24161489/is-there-an-equivalent-to-statas-egen-function