Group androgynous names and sum amount for each year in a data frame in R

青春壹個敷衍的年華 提交于 2019-12-13 05:25:03

问题


I have a data frame with 4 columns titled 'year' 'name' 'sex' 'amount'. Here is a sample data set

set.seed(1)
    data = data.frame(year=sample(1950:2000, 50, replace=TRUE),name=sample(LETTERS, 50, replace=TRUE),
                       sex=sample(c("M", "F"), 50, replace=TRUE), amount=sample(40:100, 50, replace=TRUE))

I want to find only names that occur as both an ‘m’ and an ‘f’ and sum the amount for each year.

Any help would be greatly appreciated


回答1:


I changed the data a bit, so that there would be common names in some years. On visual inspection this seems to work. The syntax may appear a bit cryptic if you are not familiar with data.table but it's concise and should be quite efficient:

require(data.table)
set.seed(1)
data = data.frame(year=sample(1950:1960, 100, replace=TRUE),name=sample(LETTERS, 100, replace=TRUE),
                  sex=sample(c("M", "F"), 100, replace=TRUE), amount=sample(40:100, 100,  replace=TRUE))
setDT(data) ## change to data.table format
data[, common=sum(amount[name %in% intersect(name[sex=="M"], name[sex=="F"])]), by=year]


来源:https://stackoverflow.com/questions/26347343/group-androgynous-names-and-sum-amount-for-each-year-in-a-data-frame-in-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!