Calculate percentages of a binary variable BY another variable in R

攒了一身酷 2020-12-12 01:37

I want to summarise the percentage of people that have been treated BY region.

I have created a dummy dataset for this purpose:

id <- seq_len(1000)

4 Answers
  •  遥遥无期
    2020-12-12 02:18

    For completeness, here's how you can do it using ddply() from plyr:

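    library(plyr)
    # Drop rows with a missing id, then summarise within each region
    ddply(d[!is.na(d$id), ], .(region), summarize,
          N    = length(region),        # observations per region
          prop = mean(treatment == 1))  # share treated (mean of a logical)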
    #   region   N prop
    # 1      A 200  0.5
    # 2      B 200  0.5
    # 3      C 200  0.5
    # 4      D 200  0.5
    # 5      E 200  0.5
    

    This assumes that you want to handle the NA values in id by dropping those observations. Note that prop is a proportion; multiply it by 100 to get a percentage.
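
If you would rather not load plyr, the same idea can be sketched in base R with tapply(). The data frame `d` below is a made-up stand-in, since the question's dataset is not shown in full: the column names `id`, `region` and `treatment` are taken from the ddply() call, while the values and the two missing ids are assumptions for illustration only.

```r
# Hypothetical stand-in data (NOT the asker's dataset):
# 5 regions x 200 rows, with a 0/1 treatment indicator.
set.seed(1)
d <- data.frame(
  id        = seq_len(1000),
  region    = rep(LETTERS[1:5], each = 200),
  treatment = rbinom(1000, size = 1, prob = 0.5)
)
d$id[c(3, 250)] <- NA  # assumed: a couple of missing ids

# Drop rows with a missing id, mirroring d[!is.na(d$id), ].
d_ok <- d[!is.na(d$id), ]

# The mean of a logical vector is the share of TRUE values,
# so this gives the proportion treated within each region.
prop <- tapply(d_ok$treatment == 1, d_ok$region, mean)
n    <- tapply(d_ok$treatment, d_ok$region, length)

res <- data.frame(
  region = names(prop),
  N      = as.vector(n),
  prop   = as.vector(prop)
)
res$pct <- 100 * res$prop  # proportion expressed as a percentage
print(res)
```

The only real difference from the ddply() version is that the grouping is done by tapply() and the result is assembled by hand; the NA handling and the mean-of-an-indicator trick are the same.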
