How does one aggregate and summarize data quickly?
I have a dataset whose headers look like this:

    PID Time Site Rep Count

I want to sum Count by Rep for each PID x Time x Site combination. Then, on the resulting data.frame, I want to get the mean value of Count for each PID x Time x Site combination. My current function is as follows:

    dummy <- function(data) {
      # Step 1: sum Count within each PID x Time x Site x Rep group
      A <- aggregate(Count ~ PID + Time + Site + Rep, data = data,
                     function(x) { sum(na.omit(x)) })
      # Step 2: average the per-Rep sums within each PID x Time x Site group
      B <- aggregate(Count ~ PID + Time + Site, data = A, mean)
      return(B)
    }

This is painfully slow (the original data.frame is 510000 x 20). Is there a way to speed this up with plyr?

Ramnath: You should look at the data.table package for faster aggregation.
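
To make the data.table suggestion concrete, here is a minimal sketch of how the two-step aggregation might be written. The toy data frame `dat` and the function name `dummy_dt` are made up for illustration; only the column names come from the question. It assumes the grouping columns contain no missing values, and it drops rows with a missing Count up front, which is roughly what the formula interface of `aggregate` does by default.

```r
library(data.table)

# Hypothetical toy data using the column names from the question.
dat <- data.frame(
  PID   = rep(c("a", "b"), each = 8),
  Time  = rep(rep(1:2, each = 4), times = 2),
  Site  = "s1",
  Rep   = rep(rep(1:2, each = 2), times = 4),
  Count = c(1, 2, 3, NA, 5, 6, 7, 8, 1, 1, 1, 1, 2, 2, 2, 2)
)

# Hypothetical data.table counterpart of dummy() from the question.
dummy_dt <- function(data) {
  dt <- as.data.table(data)
  # Step 1: drop rows with a missing Count, then sum Count within each
  # PID x Time x Site x Rep group.
  A <- dt[!is.na(Count), .(Count = sum(Count)),
          keyby = .(PID, Time, Site, Rep)]
  # Step 2: average the per-Rep sums within each PID x Time x Site group.
  B <- A[, .(Count = mean(Count)), keyby = .(PID, Time, Site)]
  B
}

res <- dummy_dt(dat)
print(res)
```

Using `keyby` rather than `by` sorts the result by the grouping columns, so the row order may differ from what `aggregate` returns even though the values should agree. How much faster this is on the real 510000-row data is not something the sketch measures, so it is worth timing both versions with `system.time()`.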