Question
From a data frame with many columns, I would like to aggregate (i.e. sum) hundreds of columns by a single column, without specifying each of the column names.
Some sample data:
names <- floor(runif(20, 1, 5))
sample <- cbind(names)
for (i in 1:20) {
  col <- rnorm(20, 2, 4)
  sample <- cbind(sample, col)
}
What I have so far is the following code, but it fails with an error saying that the arguments must have the same length.
aggregated <- aggregate.data.frame(sample[,c(2:20)], by = as.list(names), FUN = 'sum')
The original dataset is a lot bigger, so I can't specify the name of each column to be aggregated, and I can't use the list function.
Answer 1:
You don't actually need to list them at all:
aggregate(. ~ names, sample, sum) # . represents all other columns
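To make the formula interface concrete, here is a minimal sketch on a small data frame with uniquely named value columns. The names `grp`, `v1`, `v2`, `v3` and the seed are made up for illustration and are not part of the question's data.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical data: one grouping column and three value columns.
df <- data.frame(
  grp = sample(1:4, 20, replace = TRUE),
  v1  = rnorm(20, 2, 4),
  v2  = rnorm(20, 2, 4),
  v3  = rnorm(20, 2, 4)
)

# "." on the left-hand side stands for every column not named on the right.
agg <- aggregate(. ~ grp, data = df, FUN = sum)

# Cross-check one cell: the v1 sum of the first group, computed directly.
first_grp <- agg$grp[1]
direct    <- sum(df$v1[df$grp == first_grp])
```

The result has one row per distinct value of `grp` and keeps every value column, so no column name other than the grouping one has to be typed.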
Of course base R is my favorite, but in case someone wants dplyr:
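library(dplyr)
data.frame(sample) %>%
  group_by(names) %>%
  # across() needs dplyr >= 1.0; summarise_each(funs(sum)) is the older, deprecated spelling
  summarise(across(everything(), sum))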
Answer 2:
Just alter your code slightly:
aggregated <- aggregate(sample[,c(2:20)], by = list(names), FUN = 'sum')
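Why this small change helps: `by` expects a list of grouping vectors, each as long as the data has rows. `as.list(names)` splits the vector into 20 one-element entries, so `aggregate` sees 20 grouping variables of length 1, whereas `list(names)` is a single grouping vector of length 20. A minimal sketch with made-up data (`grp` and `vals` are illustrative names, not from the question):

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

grp  <- sample(1:4, 20, replace = TRUE)          # hypothetical grouping vector
vals <- data.frame(a = rnorm(20), b = rnorm(20))  # hypothetical value columns

n_as_list <- length(as.list(grp))  # 20 elements, each of length 1
n_list    <- length(list(grp))     # 1 element of length 20

# Selecting columns by position avoids typing their names.
agg <- aggregate(vals[, 1:2], by = list(grp = grp), FUN = sum)

# The as.list() variant fails: the grouping lengths do not match the row count.
err <- tryCatch(
  aggregate(vals, by = as.list(grp), FUN = sum),
  error = function(e) conditionMessage(e)
)
```

Here `err` ends up holding the error message instead of a result, which mirrors the failure described in the question.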
Source: https://stackoverflow.com/questions/44851154/aggregate-data-frame-with-many-columns-according-to-one-column