How to calculate the mean of specific rows in R?

Submitted by 夙愿已清 on 2019-12-02 11:37:27

Question


I have a data file like the following example, but much larger:

names    num    Y1  Y2
William  1  4.71    7.4
William  2  3.75    8
William  3  4.71    7.9
Katja    1  5.83    8.5
Katja    2  5.17    7.1
Katja    3  6.08    7.4
Aroma    1  4.04    7.5
Aroma    2  5       6.9
Aroma    3  4.3     7.9
...

I have to calculate the mean of Y1 and of Y2 over the three rows that share the same name (first column). Then I want to make a bar chart of the average for each name, separately for Y1 and Y2, so that the names are on the x axis and the mean is on the y axis. Could anybody help me with this?


Answer 1:


You can also use aggregate. See ?aggregate for further details.

> aggregate(.~names, FUN=mean, data=df[, -2])
    names       Y1       Y2
1   Aroma 4.446667 7.433333
2   Katja 5.693333 7.666667
3 William 4.390000 7.766667
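To reproduce the output above, here is a minimal sketch that assumes the sample rows from the question are typed by hand into a data frame named df. In practice the data would be read from the file, for example with read.table. The result is stored under the name DF, which is an assumption here. Note that df[, -2] drops the num column, so only Y1 and Y2 are averaged.

```r
# Sample rows from the question, typed in by hand (assumption: the real
# data would be read from a file instead).
df <- data.frame(
  names = rep(c("William", "Katja", "Aroma"), each = 3),
  num   = rep(1:3, times = 3),
  Y1    = c(4.71, 3.75, 4.71, 5.83, 5.17, 6.08, 4.04, 5, 4.3),
  Y2    = c(7.4, 8, 7.9, 8.5, 7.1, 7.4, 7.5, 6.9, 7.9)
)

# df[, -2] removes the second column (num); the formula `. ~ names`
# then averages every remaining column (Y1 and Y2) within each name.
DF <- aggregate(. ~ names, FUN = mean, data = df[, -2])
print(DF)
```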

Take a look at this post for other ways of taking the mean for each group.
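The linked post is not reproduced here. As one possible alternative in base R (not necessarily one of those the post covers), tapply can compute the mean of a column within each group. The sketch below again assumes the question's sample rows in a data frame named df; the name means is only illustrative.

```r
# Sample rows from the question (assumption: typed in by hand).
df <- data.frame(
  names = rep(c("William", "Katja", "Aroma"), each = 3),
  num   = rep(1:3, times = 3),
  Y1    = c(4.71, 3.75, 4.71, 5.83, 5.17, 6.08, 4.04, 5, 4.3),
  Y2    = c(7.4, 8, 7.9, 8.5, 7.1, 7.4, 7.5, 6.9, 7.9)
)

# For each of Y1 and Y2, tapply() takes the mean within each name.
# sapply() binds the two results into a matrix: one row per name,
# one column per variable.
means <- sapply(df[, c("Y1", "Y2")],
                function(col) tapply(col, df$names, mean))
print(means)
```

The result is a matrix rather than a data frame, so a single group mean can be looked up by name, for example means["Katja", "Y2"].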

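For the bar plots, use the base R barplot function, although there are other alternatives such as ggplot2 graphics. Here DF is the data frame returned by the aggregate call above.

DF <- aggregate(.~names, FUN=mean, data=df[, -2]) # group means from above
barplot(DF[,2], names.arg=DF$names, ylab="mean of Y1", las=1) # for Y1
barplot(DF[,3], names.arg=DF$names, ylab="mean of Y2", las=1) # for Y2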

which produces: [figure: two bar charts, mean of Y1 and mean of Y2 by name]

As you are very new to R, I recommend reading An Introduction to R, which is a good starting point for learning the basics of R.
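The ggplot2 alternative mentioned in this answer could look roughly like the sketch below. It assumes the ggplot2 package is installed and that the sample rows from the question are in a data frame named df. The object names DF, p1 and p2 are only illustrative.

```r
library(ggplot2)

# Sample rows from the question (assumption: typed in by hand).
df <- data.frame(
  names = rep(c("William", "Katja", "Aroma"), each = 3),
  num   = rep(1:3, times = 3),
  Y1    = c(4.71, 3.75, 4.71, 5.83, 5.17, 6.08, 4.04, 5, 4.3),
  Y2    = c(7.4, 8, 7.9, 8.5, 7.1, 7.4, 7.5, 6.9, 7.9)
)

# Group means, computed as in the aggregate() call of this answer.
DF <- aggregate(. ~ names, FUN = mean, data = df[, -2])

# geom_col() draws one bar per row, with the bar height taken from y.
p1 <- ggplot(DF, aes(x = names, y = Y1)) +
  geom_col() +
  labs(x = "names", y = "mean of Y1")

p2 <- ggplot(DF, aes(x = names, y = Y2)) +
  geom_col() +
  labs(x = "names", y = "mean of Y2")

print(p1)  # bar chart for Y1
print(p2)  # bar chart for Y2
```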




Answer 2:


Using the sqldf package (assuming df is your table):

library(sqldf)
sqldf("SELECT names, avg(Y1) as mean_Y1, avg(Y2) as mean_Y2 FROM df GROUP BY names")


Source: https://stackoverflow.com/questions/18765253/how-to-calculate-the-mean-of-specific-rows-in-r
