Cumulative sum with lag
I have a very large dataset; a simplified version looks like this:

    row. member_id entry_id comment_count timestamp
    1    1         a        4             2008-06-09 12:41:00
    2    1         b        1             2008-07-14 18:41:00
    3    1         c        3             2008-07-17 15:40:00
    4    2         d        12            2008-06-09 12:41:00
    5    2         e        50            2008-09-18 10:22:00
    6    3         f        0             2008-10-03 13:36:00

I can compute the cumulative counts per member with the following code:

    transform(df, aggregated_count = ave(comment_count, member_id, FUN = cumsum))

But I want a lag of 1 in the cumulative sum; that is, I want cumsum to ignore the current row. The result should be:

    row. member_id entry_id comment_count timestamp           previous_comments
    1    1         a        4             2008
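One possible way to get the lagged total, offered only as a minimal sketch: subtract each row's own count from the per-member running total, so that only earlier rows contribute. This assumes the rows are already sorted by timestamp within each member_id; the sample data frame below is retyped from the table above with the timestamp column left out.

```r
# Sample data retyped from the question (timestamps omitted for brevity).
df <- data.frame(
  member_id     = c(1, 1, 1, 2, 2, 3),
  entry_id      = c("a", "b", "c", "d", "e", "f"),
  comment_count = c(4, 1, 3, 12, 50, 0)
)

# Per-member running total minus the current row's own count,
# i.e. the sum of all earlier rows for that member.
# Assumption: rows are in chronological order within each member.
df <- transform(
  df,
  previous_comments = ave(comment_count, member_id,
                          FUN = function(x) cumsum(x) - x)
)

print(df)
```

The first entry of every member comes out as 0, since no earlier rows exist for it. If the data are not sorted, ordering by member_id and timestamp first would probably be needed.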