I have a data frame with 3 columns: custId, saleDate, DelivDateTime.
> head(events22)
custId saleDate DelivDate
1 280356593 2012-11-1
Between ddply and aggregate, I would expect aggregate to be the faster one, especially on data as large as yours. However, data.table would be faster than either.
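library(data.table)
# Assumes saleDate is a Date or POSIXct column; which.max() needs values that coerce to numeric
dt <- data.table(events22)
# For each custId, keep the row with the latest saleDate
dt[, .SD[which.max(saleDate)], by = custId]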
From ?data.table: .SD is a data.table containing the subset of x's data for each group, excluding the group column(s).
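To see what the .SD call does, here is a small self-contained sketch. The toy table `events` and its `amount` column are made up for illustration and only stand in for events22. It also shows a variant that collects row numbers with .I and subsets once, which is a common idiom that may be faster on large tables because it avoids building .SD for every group; treat that as something to benchmark on your own data rather than a guarantee.

```r
library(data.table)

# Hypothetical toy data standing in for events22 (the amount column is invented)
events <- data.table(
  custId   = c(1L, 1L, 2L, 2L, 2L),
  saleDate = as.Date(c("2012-11-01", "2012-11-14",
                       "2012-10-05", "2012-12-01", "2012-11-20")),
  amount   = c(10, 20, 30, 40, 50)
)

# Approach from the answer: within each custId group, .SD holds that
# group's rows (minus custId), and which.max() picks the latest sale
latest_sd <- events[, .SD[which.max(saleDate)], by = custId]

# Variant: .I gives row numbers in the full table, so we gather one
# row number per group and subset the table a single time
idx <- events[, .I[which.max(saleDate)], by = custId]$V1
latest_i <- events[idx]

print(latest_sd)
```

Both results contain one row per custId. Note that which.max() returns only the first maximum, so if a customer has several rows with the same latest saleDate, only one of them is kept.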