relative windowed running sum through data.table non-equi join

-上瘾入骨i 2021-01-05 13:35

I have a data set with the columns customerId, transactionDate, productId, and purchaseQty loaded into a data.table. For each row, I want to calculate the sum and mean of purchaseQty for the pr

2 Answers
  •  迷失自我
    2021-01-05 13:58

    This also works, and it could be considered simpler. It has the advantage of not requiring a sorted input set, and it has fewer dependencies.

    I still don't understand why it produces two transactionDate columns in the output. This seems to be a byproduct of the "on" clause. In fact, the output seems to contain one column per element of the on clause, in the same order and without their alias names, with the sum appended after them.

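    # Self-join: each row defines a window [transactionDate - 45, transactionDate];
    # sum purchaseQty over rows of the same product and customer inside that window.
    DT[.(p=productId, c=customerID, tmin=transactionDate - 45, tmax=transactionDate),
        on = .(productId==p, customerID==c, transactionDate<=tmax, transactionDate>=tmin),
        .(windowSum = sum(purchaseQty)), by = .EACHI, nomatch = 0]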
    
