Matching timestamped data to closest time in another dataset. Properly vectorized? Faster way?

暖寄归人 2020-12-28 14:53

I have a timestamp in one data frame that I am trying to match to the closest timestamp in a second data frame, so that I can extract data from the second data frame.

2 Answers
  •  忘掉有多难
    2020-12-28 15:49

    I wondered whether this could match a data.table solution for speed. It is a base-R vectorized solution, so it should outperform your apply version, and because it never actually calculates a distance, it might even be faster than the data.table nearest approach. It takes every reference time except the last and adds half the gap to the next reference time, which gives the midpoints between consecutive reference times; -Inf is prepended as the lowest break. The findInterval function then assigns each data time to one of the intervals defined by these "mid-breaks". That creates a suitable index into the rows of the reference dataset (which must be sorted by time), and the "refvalue" can then be "transferred" to the data object.

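     # Breaks: -Inf, then the midpoint between each pair of consecutive reference times
     # (reference must be sorted by datetime for findInterval to work).
     data$refvalue <- reference$refvalue[
                           findInterval( data$datetime,
                                          c(-Inf, head(reference$datetime,-1))+
                                          c(0, diff(as.numeric(reference$datetime))/2 )) ]
     # values are [1] 5 7 7 8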
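
    To make the midpoint idea easier to follow, here is a minimal self-contained sketch. The sample data is made up for illustration (it is not the data from the question), and the names ref_t, breaks, idx and brute are hypothetical helpers. It builds the breaks step by step and checks the result against a brute-force nearest-neighbour search.

    ```r
    # Hypothetical sample data, not taken from the original question.
    reference <- data.frame(
      datetime = as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 00:10:00",
                              "2020-01-01 00:30:00", "2020-01-01 01:00:00"),
                            tz = "UTC"),
      refvalue = c(10, 20, 30, 40)
    )
    data <- data.frame(
      datetime = as.POSIXct(c("2020-01-01 00:02:00", "2020-01-01 00:12:00",
                              "2020-01-01 00:16:00", "2020-01-01 00:22:00"),
                            tz = "UTC")
    )

    # Reference times as seconds since the epoch; must be sorted ascending.
    ref_t <- as.numeric(reference$datetime)

    # Breaks: -Inf, then the midpoint between each pair of consecutive reference times.
    breaks <- c(-Inf, head(ref_t, -1) + diff(ref_t) / 2)

    # Each data time falls into exactly one interval; its number is the reference row.
    idx <- findInterval(as.numeric(data$datetime), breaks)
    data$refvalue <- reference$refvalue[idx]

    # Brute-force check: row with the smallest absolute time difference.
    brute <- sapply(as.numeric(data$datetime),
                    function(t) which.min(abs(ref_t - t)))
    stopifnot(identical(idx, brute))

    data$refvalue   # 10 20 20 30
    ```

    Note that a data time lying exactly on a midpoint is assigned to the later reference row, because findInterval treats its intervals as closed on the left.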
    
