I am seeking help after having wasted almost a day. I have a big data frame (bdf) and a small data frame (sdf). I want to add a variable z to bdf depending on the value of sdf.
Edit note: I initially got a slightly different result than you did, which I now think was due to my misunderstanding of R difftime objects. Timezones in POSIXt objects also remain a mystery to me, but I now see that when I coerced a 'difftime' object to 'numeric', I got the value in "days".
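To illustrate the units issue, here is a small sketch. The two timestamps are made up for the example and are not taken from the question's data; the point is only that the unit of a 'difftime' is chosen automatically unless you ask for one.

```r
# Hypothetical timestamps, chosen only for illustration
t1 <- as.POSIXct("2013-05-19 00:00:00", tz = "UTC")
t2 <- as.POSIXct("2013-05-22 00:00:00", tz = "UTC")

d <- t2 - t1     # units are picked automatically
units(d)         # "days"
as.numeric(d)    # 3

# Request seconds explicitly so arithmetic on POSIXct values stays consistent
secs <- as.numeric(difftime(t2, t1, units = "secs"))
secs             # 259200
```

Since adding a plain number to a POSIXct value is interpreted as seconds, asking for units = "secs" explicitly avoids the mismatch.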
The findInterval function is very useful for creating an index: it maps a vector of values onto a set of adjoining, non-overlapping intervals. You really only have two time-points (the midpoints between consecutive sdf$ts values), and they split the time line into three intervals.
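As a minimal sketch of how findInterval works as an index creator, here is a purely numeric example. The breakpoints, values and labels are invented for illustration.

```r
# Two finite breakpoints give three intervals: [-Inf, 10), [10, 20), [20, Inf)
breaks <- c(-Inf, 10, 20, Inf)
x <- c(3, 10, 15, 20, 42)

idx <- findInterval(x, breaks)   # 1 2 2 3 3
labels <- c("low", "mid", "high")
res <- labels[idx]               # "low" "mid" "mid" "high" "high"
```

The returned integers are used directly to index a vector of per-interval values, which is the same pattern as the assignment to bdf$z.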
# Breakpoints are the midpoints between consecutive sdf$ts values
bdf$z <- c(0.2, -0.1, 0.3)[findInterval(bdf$tb,
           c(-Inf,
             sdf$ts[2] - 0.5*as.numeric(difftime(sdf$ts[2], sdf$ts[1], units="secs")),
             sdf$ts[3] - 0.5*as.numeric(difftime(sdf$ts[3], sdf$ts[2], units="secs")),
             Inf))]
> bdf
                    tb    z
1  2013-05-19 17:11:22  0.2
2  2013-05-21 06:40:58  0.2
3  2013-05-22 20:10:34  0.2
4  2013-05-24 09:40:10 -0.1
5  2013-05-25 23:09:46 -0.1
6  2013-05-27 12:39:22  0.3
7  2013-05-29 02:08:58  0.3
8  2013-05-30 15:38:34  0.3
9  2013-06-01 05:08:10  0.3
10 2013-06-02 18:37:46  0.3
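If sdf had more than three rows, the same idea could be written without hard-coding the breakpoints. The sketch below uses made-up stand-ins for sdf and bdf, and it assumes sdf has a z column sorted by ts; neither the data nor that column name is confirmed by the question.

```r
# Hypothetical data standing in for the question's sdf and bdf
sdf <- data.frame(
  ts = as.POSIXct(c("2013-05-20", "2013-05-26", "2013-06-01"), tz = "UTC"),
  z  = c(0.2, -0.1, 0.3)   # assumed column, not confirmed by the question
)
bdf <- data.frame(
  tb = seq(as.POSIXct("2013-05-19", tz = "UTC"), by = "2 days", length.out = 8)
)

# Midpoints between consecutive ts values, on the numeric (seconds) scale
ts_num <- as.numeric(sdf$ts)
mids <- head(ts_num, -1) + diff(ts_num) / 2

# findInterval() returns 0 below the first midpoint, so add 1 to index sdf$z
bdf$z <- sdf$z[findInterval(as.numeric(bdf$tb), mids) + 1]
```

Working on the numeric scale sidesteps the difftime units question, because as.numeric() on a POSIXct value always gives seconds.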
I also checked whether my result would change if the intervals in findInterval were closed on the right rather than on the left (the default), and saw no difference. That is expected unless a value of tb falls exactly on a breakpoint.
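For reference, one way to see the effect of the closure is the left.open argument of findInterval. This sketch assumes R 3.3.0 or later, where that argument is available, and uses invented numbers; it is not necessarily how the check above was done.

```r
breaks <- c(-Inf, 10, 20, Inf)
x <- c(9.9, 10, 10.1)   # 10 sits exactly on a breakpoint

left_closed  <- findInterval(x, breaks)                    # 1 2 2, intervals [a, b)
right_closed <- findInterval(x, breaks, left.open = TRUE)  # 1 1 2, intervals (a, b]
```

Only the value lying exactly on the breakpoint changes interval, so the two settings agree whenever no observation coincides with a boundary.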