Creating variable in R data frame depending on another data frame

后悔当初 2020-12-31 20:34

I am seeking help after having wasted almost a day on this. I have a big data frame (bdf) and a small data frame (sdf). I want to add a variable z to bdf whose value depends on the values in sdf.

4 answers
  •  余生分开走
    2020-12-31 21:02

    Edit note: I initially got a slightly different result than you did, which I now think was due to my misunderstanding of R difftime objects. Time zones in POSIXt objects also remain a mystery to me, but I now see that when I coerced a 'difftime' object to 'numeric', I got the value in days.
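
    To illustrate the unit issue described in the note, here is a minimal sketch with two made-up timestamps (t1 and t2 are hypothetical, not taken from the question). Subtracting POSIXct values picks the unit automatically, so a bare as.numeric() can return days where seconds were expected; asking for the unit explicitly avoids the surprise.

```r
# Hypothetical timestamps, three days apart, in an explicit time zone
t1 <- as.POSIXct("2013-05-19 00:00:00", tz = "UTC")
t2 <- as.POSIXct("2013-05-22 00:00:00", tz = "UTC")

d <- t2 - t1                                  # difftime with automatically chosen units
auto_units <- units(d)                        # "days" for a gap of this size
as_days <- as.numeric(d)                      # 3, the value in days
as_secs <- as.numeric(d, units = "secs")      # 259200, the same gap in seconds
```

    Passing units= to difftime(), as the code in this answer does, or to as.numeric(), as above, makes the result independent of the automatic choice.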

    The findInterval function is very useful for creating an index: it maps a vector of values onto a set of adjoining, non-overlapping intervals. You really only have two breakpoints, which split the time axis into three intervals.
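
    As a minimal illustration of that index mapping, here is a sketch on made-up numbers (breaks and x are hypothetical and are not the data from the question). findInterval returns, for each value, the number of the interval it falls into, and that number is then used to pick from a vector of labels.

```r
# Hypothetical breakpoints: two finite cut points give three intervals
breaks <- c(-Inf, 10, 20, Inf)
x <- c(3, 12, 27, 19)

idx <- findInterval(x, breaks)        # interval number for each value: 1 2 3 2
labels <- c(0.2, -0.1, 0.3)[idx]      # look up one label per interval
```

    The same pattern applies to date-times, because POSIXct values are stored as numbers of seconds.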

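    # Breakpoints are the midpoints between consecutive sdf$ts values
    breaks <- c(-Inf,
                sdf$ts[2] - 0.5 * as.numeric(difftime(sdf$ts[2], sdf$ts[1], units = "secs")),
                sdf$ts[3] - 0.5 * as.numeric(difftime(sdf$ts[3], sdf$ts[2], units = "secs")),
                Inf)
    # Pick one z value per interval
    bdf$z <- c(0.2, -0.1, 0.3)[findInterval(bdf$tb, breaks)]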
    
    > bdf
                        tb    z
    1  2013-05-19 17:11:22  0.2
    2  2013-05-21 06:40:58  0.2
    3  2013-05-22 20:10:34  0.2
    4  2013-05-24 09:40:10 -0.1
    5  2013-05-25 23:09:46 -0.1
    6  2013-05-27 12:39:22  0.3
    7  2013-05-29 02:08:58  0.3
    8  2013-05-30 15:38:34  0.3
    9  2013-06-01 05:08:10  0.3
    10 2013-06-02 18:37:46  0.3
    

    I also checked whether my result would be affected if the intervals in findInterval were closed on the right rather than on the left (the default), and saw no difference.
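
    The check on interval closure can be sketched with the left.open argument of findInterval, again on made-up numbers (breaks and x are hypothetical). The two settings differ only for values that land exactly on a breakpoint, which is presumably why no difference showed up with the timestamps above.

```r
# Hypothetical data: 10 and 20 sit exactly on the breakpoints
breaks <- c(-Inf, 10, 20, Inf)
x <- c(5, 10, 15, 20, 25)

closed_left  <- findInterval(x, breaks)                    # [a, b): 1 2 2 3 3
closed_right <- findInterval(x, breaks, left.open = TRUE)  # (a, b]: 1 1 2 2 3

# Positions where the two conventions disagree: only the boundary values
differs <- which(closed_left != closed_right)              # 2 4
```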
