问题
I have a data table that looks like this:
serialno state type type2
1 100 FL A C
2 100 CA A D
3 101 CA B D
4 102 GA A C
5 103 WA A C
6 103 PA B C
7 104 CA B D
8 104 CA B C
9 105 NY A D
10 105 NJ B C
I need to create a new data table that is aggregated by serialno
but calculates the count of each type of existing variables. So the end result would look like this.
FL CA GA A B C D
100 1 1 2 1 1
101 1 1 1 1
102 1 1
103 1 1 1 1 2
104 2 2 1 1
105 1 1 1 1 1 1
I'm sure there is a solution using some combination of group_by
and summarize
, but I haven't been able to figure this out. Is the easiest solution just to create first spread out the state
, type
, and type2
columns and then use summarize to create counts?
回答1:
Using:
library(reshape2)
recast(dat, serialno ~ value, id.var = 'serialno', fun.aggregate = length)
gives:
serialno A B C CA D FL GA NJ NY PA WA
1 100 2 0 1 1 1 1 0 0 0 0 0
2 101 0 1 0 1 1 0 0 0 0 0 0
3 102 1 0 1 0 0 0 1 0 0 0 0
4 103 1 1 2 0 0 0 0 0 0 1 1
5 104 0 2 1 2 1 0 0 0 0 0 0
6 105 1 1 1 0 1 0 0 1 1 0 0
Or:
library(dplyr)
library(tidyr)
dat %>%
gather(key, val, state:type2) %>%
group_by(serialno, val) %>%
tally() %>%
spread(val, n, fill = 0)
Or:
library(data.table)
dcast(melt(setDT(dat), id = 1), serialno ~ value, fun.aggregate = length)
Or (per @Frank's suggestion):
res <- table(melt(dat, id="serialno")[, c("serialno","value")])
print(res, zero.print="")
Of which the last one results in:
value
serialno A B C CA D FL GA NJ NY PA WA
100 2 1 1 1 1
101 1 1 1
102 1 1 1
103 1 1 2 1 1
104 2 1 2 1
105 1 1 1 1 1 1
来源:https://stackoverflow.com/questions/43262085/using-group-by-create-aggregated-counts-conditional-on-value