R data.table rolling join “mult” not working as expected

傲寒 2020-12-11 08:34

I have two data.frames, each with a time series. My goal is to use the time series of df2 to mark the closest timestamp in df1. Each timestamp in df2 should only mark ONE tim

1 answer
  • 2020-12-11 08:50

    When running dt2[dt1, roll = "nearest"] you are basically saying "for each row in dt1, return the row from dt2 that is its nearest match on the key". So:

    • dates2 in row one in dt2 is the nearest to dates1 in row one in dt1
    • dates2 in row one in dt2 is the nearest to dates1 in row two in dt1
    • dates2 in row two in dt2 is the nearest to dates1 in row three in dt1

    Hence,

    dt2[dt1, roll = "nearest"]
    #                 dates2 values2 values1
    # 1: 2015-10-26 12:00:00       A       a
    # 2: 2015-10-26 13:00:00       A       b
    # 3: 2015-10-26 14:00:00       C       c
    

    These are all the rows from dt1, each with the values2 of its nearest match in dt2.
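
To make the direction of the join concrete, here is a minimal sketch. The sample timestamps are an assumption (the question's original data is not shown above); they are chosen only so that no timestamp is equally close to two candidates.

```r
library(data.table)

# Hypothetical sample data, keyed on the timestamp columns.
dt1 <- data.table(
  dates1  = as.POSIXct(c("2015-10-26 12:00:00",
                         "2015-10-26 13:00:00",
                         "2015-10-26 14:00:00"), tz = "UTC"),
  values1 = c("a", "b", "c"),
  key     = "dates1"
)
dt2 <- data.table(
  dates2  = as.POSIXct(c("2015-10-26 12:20:00",
                         "2015-10-26 13:50:00"), tz = "UTC"),
  values2 = c("A", "C"),
  key     = "dates2"
)

# One result row per row of dt1 (the table inside the brackets);
# each row looks up its nearest neighbour in dt2.
res <- dt2[dt1, roll = "nearest"]

# The 12:20 row of dt2 is picked for both 12:00 and 13:00.
res$values2
```

Because the lookup runs once per row of dt1, a single dt2 row can be returned several times, which is why this direction cannot guarantee that each dt2 timestamp marks only one row.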


    Instead, we want to join the other way around: for each row in dt2, find its nearest match in dt1 on the key, and update only those matched rows in dt1 with values2, namely

    dt1[dt2, roll = "nearest", values2 := i.values2] 
    dt1
    #                 dates1 values1 values2
    # 1: 2015-10-26 12:00:00       a       A
    # 2: 2015-10-26 13:00:00       b      NA
    # 3: 2015-10-26 14:00:00       c       C
    

    Some additional notes

    • You don't need to convert to a data.frame first and then to a data.table; you can just do dt1 <- data.table(dates1, values1), etc.
    • While you are at it, you can set the key on the fly using the key parameter of data.table(), namely dt1 <- data.table(dates1, values1, key = "dates1"), etc.
    • Or you can skip setting keys altogether and use on instead (v1.9.6+), namely dt1[dt2, roll = "nearest", values2 := i.values2, on = c(dates1 = "dates2")]
    • Finally, please refrain from making unnecessary copies, e.g., instead of <- and data.table(df) use := and setDT(df), see here for more information
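
The notes above can be combined into one short sketch. The data here is hypothetical (the question's original values are not shown), and the data.frames df1 and df2 stand in for whatever the asker started with.

```r
library(data.table)

# Hypothetical data.frames standing in for the asker's df1 and df2.
df1 <- data.frame(
  dates1  = as.POSIXct(c("2015-10-26 12:00:00",
                         "2015-10-26 13:00:00",
                         "2015-10-26 14:00:00"), tz = "UTC"),
  values1 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  dates2  = as.POSIXct(c("2015-10-26 12:20:00",
                         "2015-10-26 13:50:00"), tz = "UTC"),
  values2 = c("A", "C"),
  stringsAsFactors = FALSE
)

# Convert in place instead of copying with data.table(df).
setDT(df1)
setDT(df2)

# No keys needed: `on` maps dates1 (in df1) to dates2 (in df2).
# Each row of df2 marks its nearest row in df1; := updates df1 by reference.
df1[df2, roll = "nearest", values2 := i.values2, on = c(dates1 = "dates2")]

# Rows of df1 that no df2 timestamp was nearest to stay NA.
df1$values2
```

With these sample values, only the 12:00 and 14:00 rows are marked, so each df2 timestamp marks exactly one row of df1.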