I have two data.frames each with a time series. My goal ist to use the timeseries of df2 to mark the closest timestamp in df1. Each timestamp in df2 should only mark ONE tim
When running dt2[dt1, roll = "nearest"]
you are basically saying "return the rows from dt2
according to the nearest join to each row in dt1
using the key. So
dates2
in row one in dt2
is the nearest to dates1
in row one in dt1
dates2
in row one in dt2
is the nearest to dates1
in row two in dt1
dates2
in row two in dt2
is the nearest to dates1
in row three in dt1
Hence,
dt2[dt1, roll = "nearest"]
# dates2 values2 values1
# 1: 2015-10-26 12:00:00 A a
# 2: 2015-10-26 13:00:00 A b
# 3: 2015-10-26 14:00:00 C c
Which are all the rows from dt1
with the joined values2
from dt2
.
Instead, we want to join the other way around, namely "extract values2
from dt2
according to the nearest join by each row in dt2
using the key and update the matched rows in dt1
", namely
dt1[dt2, roll = "nearest", values2 := i.values2]
dt1
# dates1 values1 values2
# 1: 2015-10-26 12:00:00 a A
# 2: 2015-10-26 13:00:00 b NA
# 3: 2015-10-26 14:00:00 c C
Some additional notes
data.frame
and then to data.table
, you can just do dt1 <- data.table(dates1, values1)
and etc.key
parameter data.table
, namely dt1 <- data.table(dates1, values1, key = "dates1")
and etc.on
instead (V 1.9.6+), namely dt1[dt2, roll = "nearest", values2 := i.values2, on = c(dates1 = "dates2")]
<-
and data.table(df)
use :=
and setDT(df)
, see here for more information