Summary of proportions by group

Question

What would be the best tool/package to use to calculate proportions by subgroups? I thought I could try something like this:

data(mtcars)
library(plyr)
ddply(mtcars, .(cyl), transform, Pct = gear/length(gear))

But the output is not what I want, as I would like a result whose number of rows equals the number of cyl groups. Even if I change transform to summarise, I still get the same problem.

I am open to other packages, but I thought plyr would be best as I would eventually like to build a function around this. Any ideas?

I'd appreciate any help just solving a basic problem like this.

Answer 1:

library(dplyr)

mtcars %>%
  count(cyl, gear) %>%
  mutate(prop = prop.table(n))

See ?count. Basically, count() is a wrapper for summarise() with n(), but it does the group_by() for you. Look at the output of just mtcars %>% count(cyl, gear). We then use mutate() to add a variable named prop, which is the result of calling prop.table() on the n column that count(cyl, gear) created.

You could turn this into a function using the SE version of count(), that is, count_(). See the vignette on Non-Standard Evaluation in the dplyr package. Note that the underscore verbs such as count_() have been deprecated since dplyr 0.7.0 in favour of tidy evaluation.
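Since the goal is to build a function around this, here's a minimal sketch of what that might look like with tidy evaluation. It assumes a recent dplyr (1.0 or later) where the {{ }} operator is available; prop_by_group and its argument names are made up for illustration and are not part of dplyr. It also groups explicitly before computing the proportion rather than relying on whatever grouping count() leaves behind.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): share of each `value` level
# within each `group` level. Assumes dplyr >= 1.0.0 for {{ }}.
prop_by_group <- function(data, group, value) {
  data %>%
    count({{ group }}, {{ value }}) %>%  # one row per group/value pair
    group_by({{ group }}) %>%            # make the denominator explicit
    mutate(prop = n / sum(n)) %>%        # proportion within each group
    ungroup()
}

res <- prop_by_group(mtcars, cyl, gear)
res
```

Called as prop_by_group(mtcars, cyl, gear), this should give one row per cyl/gear combination, with prop summing to 1 inside each cyl.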

Here's a nice GitHub gist addressing lots of cross-tabulation variants with dplyr and other packages.

Answer 2:

To get frequency within a group:

library(dplyr)
mtcars %>% count(cyl, gear) %>% mutate(Freq = n/sum(n))
# Source: local data frame [8 x 4]
# Groups: cyl [3]
# 
#     cyl  gear     n       Freq
#   (dbl) (dbl) (int)      (dbl)
# 1     4     3     1 0.09090909
# 2     4     4     8 0.72727273
# 3     4     5     2 0.18181818
# 4     6     3     2 0.28571429
# 5     6     4     4 0.57142857
# 6     6     5     1 0.14285714
# 7     8     3    12 0.85714286
# 8     8     5     2 0.14285714

or equivalently,

mtcars %>% group_by(cyl, gear) %>% summarise(n = n()) %>% mutate(Freq = n/sum(n))

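Be careful about what the grouping is at each stage, or your numbers will be off: mutate() computes sum(n) within whatever groups are active at that point, so Freq is a within-cyl proportion only while the data is still grouped by cyl.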
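To make the grouping point concrete, here's a small sketch that sets the grouping explicitly instead of depending on what count() returns. As far as I know, newer dplyr versions return an ungrouped result from count() when the input is ungrouped, in which case n/sum(n) would be a share of all 32 rows rather than a share within cyl. The name freq is just for illustration.

```r
library(dplyr)

freq <- mtcars %>%
  count(cyl, gear) %>%        # n = number of cars per cyl/gear pair
  group_by(cyl) %>%           # denominator should be the cyl total
  mutate(Freq = n / sum(n))   # proportion of each gear within its cyl

# Sanity check: the proportions should sum to 1 within each cyl
totals <- freq %>% summarise(total = sum(Freq))
totals
```

If the totals come out as 1 for every cyl, the grouping was what you intended. If they sum to 1 only across the whole table, the proportions were computed against the grand total.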

Source: https://stackoverflow.com/questions/37057784/summary-of-proportions-by-group

Tags

dplyr

plyr