calculate a mean by criteria in R

家住魔仙堡 submitted on 2019-12-11 01:29:23

Question


I would like to calculate a sample mean in R subject to a specific criterion. For example, given the table below, I want the mean of wage_accepted for only those rows where stage is 1 or 2:

treatment session period stage wage_accepted type 
1            1      1     1            25  low 
1            1      1     3            19  low 
1            1      1     3            15  low 
1            1      1     2            32 high 
1            1      1     2            13  low 
1            1      1     2            14  low 
1            1      2     1            17  low 
1            1      2     4            16  low
1            1      2     5            21  low

The desired output in this case is:

   stage  mean
      1  21.0 
      2  19.6667

Thanks in advance.


Answer 1:


Using the dplyr package:

library(dplyr)

df %>% filter(stage==1 | stage ==2) %>% group_by(stage) %>%
  summarise(mean=mean(wage_accepted))

If you are new to dplyr, here is a bit of explanation:

Take the data frame df and keep only the rows where stage equals 1 or 2. Then, for each group in stage, calculate the mean of wage_accepted.
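
To try the pipeline on the question's own table, here is a minimal sketch, assuming the dplyr package is installed. The data frame is rebuilt by hand from the values in the question, and `stage %in% c(1, 2)` is used as a shorter equivalent of the two `==` tests; the name `result` is only illustrative.

```r
library(dplyr)

# Rebuild the question's table (values copied from the question).
df <- data.frame(
  treatment     = rep(1, 9),
  session       = rep(1, 9),
  period        = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  stage         = c(1, 3, 3, 2, 2, 2, 1, 4, 5),
  wage_accepted = c(25, 19, 15, 32, 13, 14, 17, 16, 21),
  type          = c("low", "low", "low", "high", "low", "low", "low", "low", "low")
)

# Keep stages 1 and 2, then average wage_accepted within each stage.
result <- df %>%
  filter(stage %in% c(1, 2)) %>%
  group_by(stage) %>%
  summarise(mean = mean(wage_accepted))

print(result)
```

The `%in%` form scales more comfortably than chained `|` conditions if more stages need to be included later.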




Answer 2:


Check this out. It's a toy example using the built-in iris data set, but it shows how compact data.table is. dplyr is a great option as well.


    library(data.table)

    dat <- data.table(iris)
    dat[Species == "setosa" | Species == "virginica", mean(Sepal.Width), by = Species]

If you need speed, data.table is very fast; it is worth looking up. I'll leave it to you to apply this to your question. Best, M2K
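
One possible way to apply the same idea to the question's data is sketched below, assuming the data.table package is installed. Only the two relevant columns are rebuilt from the question's table; the names `dt` and `res` are hypothetical.

```r
library(data.table)

# The stage and wage_accepted columns from the question's table.
dt <- data.table(
  stage         = c(1, 3, 3, 2, 2, 2, 1, 4, 5),
  wage_accepted = c(25, 19, 15, 32, 13, 14, 17, 16, 21)
)

# i: keep stages 1 and 2; j: compute the mean; by: one row per stage.
res <- dt[stage %in% c(1, 2), .(mean = mean(wage_accepted)), by = stage]

print(res)
```

The filter, the computation, and the grouping all fit in a single `[` call, which is what makes the syntax so compact.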




Answer 3:


Assuming your data is in a CSV file, you can read it into a data frame using:

data <- read.csv("PATH_TO_YOUR_CSV_FILE/Name_of_the_CSV_File.csv")

Column names in R are case-sensitive, so they must match the CSV header (stage and wage_accepted in the question's table). Then you can compute the mean for every stage with sapply():

sapply(split(data$wage_accepted, data$stage), mean)

       1        2        3        4        5 
21.00000 19.66667 17.00000 16.00000 21.00000 

Or with tapply():

tapply(data$wage_accepted, data$stage, mean)

       1        2        3        4        5 
21.00000 19.66667 17.00000 16.00000 21.00000 

Both calls return the mean for every stage, so select the entries for stages 1 and 2 from the result.
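
To restrict the result to stages 1 and 2 directly, the rows can be filtered before calling tapply(). This is a minimal base R sketch; the data frame is rebuilt by hand from the question's table instead of being read from a CSV file, and `keep` and `means` are illustrative names.

```r
# The stage and wage_accepted columns from the question's table.
data <- data.frame(
  stage         = c(1, 3, 3, 2, 2, 2, 1, 4, 5),
  wage_accepted = c(25, 19, 15, 32, 13, 14, 17, 16, 21)
)

# Logical index of the rows that meet the criterion.
keep <- data$stage %in% c(1, 2)

# Group means computed on the filtered rows only.
means <- tapply(data$wage_accepted[keep], data$stage[keep], mean)

print(means)
```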



Answer 4:


You can compute the mean for every stage first and then filter for the stages you need:

# Calculate the mean of wage_accepted for each stage
df <- do.call(rbind, lapply(split(data, data$stage), function(x)
  data.frame(stage = unique(x$stage), mean = mean(x$wage_accepted))))

# Keep the means for stages 1 and 2 only
required <- subset(df, stage %in% c(1, 2))
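
A more compact base R variant filters first and then uses aggregate() in place of the split/lapply/rbind combination. This is a sketch of an alternative technique, not the approach shown in this answer; the data frame is rebuilt from the question's table.

```r
# The stage and wage_accepted columns from the question's table.
data <- data.frame(
  stage         = c(1, 3, 3, 2, 2, 2, 1, 4, 5),
  wage_accepted = c(25, 19, 15, 32, 13, 14, 17, 16, 21)
)

# Filter to stages 1 and 2, then take the mean of wage_accepted per stage.
required <- aggregate(
  wage_accepted ~ stage,
  data = subset(data, stage %in% c(1, 2)),
  FUN  = mean
)

# Rename the value column to match the desired output in the question.
names(required)[2] <- "mean"

print(required)
```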


Source: https://stackoverflow.com/questions/29724286/calculate-a-mean-by-criteria-in-r
