How can I get xtabs to calculate means instead of sums in R?

a 夏天 提交于 2019-11-27 02:19:32

问题


I have a data frame where each line represents an individual. That data frame has two variables: age and year. I want to make a table of average ages per year. How can I do it?

The best I could come up with was xtabs(age ~ year, dataframe), but this gives me the sum of ages per year.


回答1:


Use aggregate:

xtabs(hp~cyl+gear,aggregate(hp~cyl+gear,mtcars,mean))
   gear
cyl        3        4        5
  4  97.0000  76.0000 102.0000
  6 107.5000 116.5000 175.0000
  8 194.1667   0.0000 299.5000



回答2:


Have a look at the plyr package, specifically ddply

ddply(dataframe, .(year), summarise, mean(age))

If you have to use xtabs... see the above answer.




回答3:


I'm glad that xtabs solution works for you. I must have a package loaded that interferes with it for me (and gives an error). Another solution would be:

tapply(dfrm$age, dfrm$year, FUN=mean)

To get additional dimensions to the table (array) just keep adding additional factors to the second INDEX argument in the form list(fac1, fac2, fac3).

Applied to the example using mtcars:

 tapply(mtcars$hp, list(mtcars$cyl,mtcars$gear), mean)
         3     4     5
4  97.0000  76.0 102.0
6 107.5000 116.5 175.0
8 194.1667    NA 299.5

Or even more compactly:

with(mtcars, tapply(hp, list(cyl, gear), mean))



回答4:


Another solution is to use 2 xtabs functions, first one to calculate the sum of the ages, 2nd one to get the headcount for each group, then get the mean by sum/headcount.

E.g., xtabs(age ~ year, dataframe)/xtabs( ~ year, dataframe)

However, the aggregate approach can be used to calculate median or other functions applied to the dataframe. Definitely more flexible.



来源:https://stackoverflow.com/questions/9007741/how-can-i-get-xtabs-to-calculate-means-instead-of-sums-in-r

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!