Efficient way to perform running total in the last 365 day window

梦如初夏 2021-01-06 10:40

This is what my data frame looks like:

library(data.table)

df <- fread('
                Name  EventType  Date  SalesAmount RunningTotal Runningt         


        
3 answers
  •  予麋鹿 (original poster)
    2021-01-06 11:38

    Here's an approach using the foverlaps() function from the data.table package:

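    library(data.table)

    # Build a 365-day look-back interval [start, end] for every event.
    # EventDate is the date column name used here; adjust it if yours differs.
    setDT(df)[, end := as.Date(EventDate, format="%d/%m/%Y")
            ][, start := end - 365L]
    setkey(df, Name, start, end)

    # Self-join on overlapping intervals within each Name; return row indices only.
    olaps = foverlaps(df, df, nomatch=0L, which=TRUE)
    # Keep pairs where row y is not later than row x, then sum the matched sales per x.
    olaps = olaps[xid >= yid, .(ans = sum(df$SalesAmount[yid])), by=xid]

    df[olaps$xid, Runningtotal := olaps$ans]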
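
To see the approach above end to end, here is a minimal sketch on made-up data. The names, dates, and amounts are hypothetical and are not taken from the question; the date column is assumed to be called EventDate and to use the dd/mm/yyyy format, as in the snippet above.

```r
library(data.table)

# Hypothetical sample data, invented purely for illustration.
df <- data.table(
  Name        = c("John", "John", "John", "Tom", "Tom"),
  EventDate   = c("01/01/2014", "01/06/2014", "01/03/2015",
                  "01/01/2014", "01/02/2015"),
  SalesAmount = c(100, 200, 300, 50, 70)
)

# 365-day look-back interval for each event.
df[, end := as.Date(EventDate, format = "%d/%m/%Y")][, start := end - 365L]
setkey(df, Name, start, end)

# Overlapping self-join within each Name, returning row indices (xid, yid).
olaps <- foverlaps(df, df, nomatch = 0L, which = TRUE)

# For each row x, sum the sales of rows y that are not later than x.
olaps <- olaps[xid >= yid, .(ans = sum(df$SalesAmount[yid])), by = xid]
df[olaps$xid, Runningtotal := olaps$ans]

print(df[, .(Name, end, SalesAmount, Runningtotal)])
# Runningtotal should come out as 100, 300, 500, 50, 70
```

John's third event (2015-03-01) picks up the 2014-06-01 sale but not the 2014-01-01 one, which falls outside its 365-day window. Tom's two events are more than a year apart, so each total is just its own amount.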
    

    You can remove the start and end columns, if necessary, by doing:

    df[, c("start", "end") := NULL]
    

    It would be nice to know how fast or slow this is on your data.
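
One rough way to check the speed is to wrap the same steps in system.time() on synthetic data. This is only a sketch: the row count, number of names, and date range below are arbitrary assumptions, and timings will differ on real data and hardware.

```r
library(data.table)

set.seed(1)   # synthetic data, generated purely for timing
n <- 10000L   # arbitrary size; increase it to stress the join
df <- data.table(
  Name        = sample(sprintf("N%03d", 1:200), n, replace = TRUE),
  end         = as.Date("2010-01-01") + sample(0:1824, n, replace = TRUE),
  SalesAmount = round(runif(n, 1, 1000), 2)
)
df[, start := end - 365L]
setkey(df, Name, start, end)

# Time the overlap join, the aggregation, and the assignment together.
timing <- system.time({
  olaps <- foverlaps(df, df, nomatch = 0L, which = TRUE)
  olaps <- olaps[xid >= yid, .(ans = sum(df$SalesAmount[yid])), by = xid]
  df[olaps$xid, Runningtotal := olaps$ans]
})
print(timing)
```

The number of overlapping pairs grows with how many events each Name has inside a one-year window, so dense histories cost more than sparse ones.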
