R use ddply or aggregate

渐次进展 2020-12-01 21:58

I have a data frame with 3 columns: custId, saleDate, DelivDateTime.

> head(events22)
     custId            saleDate      DelivDate
1 280356593 2012-11-1         

4 Answers
  •  再見小時候
    2020-12-01 22:11

    Between ddply and aggregate, I'd expect aggregate to be faster, especially on data as large as yours. Faster still, though, would be data.table.

    library(data.table)
    dt <- as.data.table(events22)
    # for each custId, keep the row with the latest saleDate
    dt[, .SD[which.max(saleDate)], by = custId]
    

    From ?data.table: .SD is a data.table containing the subset of x's Data for each group, excluding the group column(s).
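
As a rough illustration of the answer above, here is a minimal sketch on made-up data. It assumes saleDate is stored as a Date; the toy data frame `events` and its `amount` column are hypothetical and are not the asker's events22. It contrasts the data.table call, which keeps the whole row per customer, with aggregate, which here returns only the grouped maximum.

```r
library(data.table)

# Toy data (hypothetical values, not the asker's events22)
events <- data.frame(
  custId   = c(1L, 1L, 2L, 2L, 2L),
  saleDate = as.Date(c("2012-11-01", "2012-11-05", "2012-10-20",
                       "2012-12-01", "2012-11-15")),
  amount   = c(10, 20, 30, 40, 50)
)

# data.table: for each custId, keep the full row with the latest saleDate
dt <- as.data.table(events)
latest_dt <- dt[, .SD[which.max(saleDate)], by = custId]

# Base R: aggregate() returns only custId and the maximum saleDate
latest_agg <- aggregate(saleDate ~ custId, data = events, FUN = max)

print(latest_dt)
print(latest_agg)
```

If the other columns are needed with the base R approach, one option is to merge `latest_agg` back onto `events` by custId and saleDate.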
