I want to summarise the percentage of people who have been treated, by region.
I have created a dummy dataset for this purpose:
id <- seq_len(1000)
You could also use data.table:
library(data.table)
setDT(d)[, .(.N, prop = sum(treatment == 2) / .N), by = region]
   region   N prop
1:      A 200  0.5
2:      B 200  0.5
3:      C 200  0.5
4:      D 200  0.5
5:      E 200  0.5
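If you want to run this end to end without data.table, here is a minimal base R sketch. The dummy data is not shown in full here, so `d` below is a hypothetical stand-in built to match the output above (five regions of 200 rows each), and I am assuming `treatment == 2` means "treated". The column name `pct` and the object `res` are just illustrative.

```r
# Hypothetical stand-in for the dummy data (assumption, not the original):
# 1000 people, 5 regions of 200 rows, treatment coded 1 or 2.
d <- data.frame(
  id        = seq_len(1000),
  region    = rep(LETTERS[1:5], each = 200),
  treatment = rep(c(1, 2), times = 500)
)

# The mean of a logical vector is the share of TRUE values, so this is
# the same quantity as sum(treatment == 2) / .N, computed per region.
prop <- tapply(d$treatment == 2, d$region, mean)

# Collect group sizes and percentages in one data frame.
res <- data.frame(
  region = names(prop),
  N      = as.vector(table(d$region)),
  pct    = 100 * as.vector(prop)
)
print(res)
```

Multiplying by 100 gives a percentage rather than the proportion that the data.table call returns; drop that factor if you want the two results to line up exactly.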