Join tables based on multiple ranges in R

扶醉桌前 submitted on 2019-11-29 12:14:17

With the tidyverse, you can try something like:

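library(dplyr)

# pair every row of data with the params rows for the same id,
# then keep only the pairs where time and ang fall inside both ranges
data %>%
  inner_join(params, by = "id") %>%
  filter(time > valid_from & time < valid_to) %>%
  filter(ang > angle_begin & ang < angle_end)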

With data.table this is a non-equi join:

library(data.table)
# coerce to data.table
setDT(params)
setDT(data)

# keep only rows of data with matches in params
data[params, 
     on = .(id, time >= valid_from, time <= valid_to, ang >= angle_begin, ang <= angle_end),
     .(id, time = x.time, ang = x.ang, param)]
    id time        ang param
 1:  1  2.0 140.383052     A
 2:  1  3.5 152.772925     A
 3:  1  8.0 141.039548     A
 4:  2  1.0 104.434264     B
 5:  2  2.0 140.383052     B
 6:  2  3.5 152.772925     B
 7:  2  8.0 141.039548     B
 8:  2 16.0 150.424306     B
 9:  2 16.5  92.201187     B
10:  ...
41:  4 22.0  89.813795     D
42:  4 22.5 131.004229     D
43:  4 26.0  79.839443     D
44:  4 27.5 128.291356     D
45:  4 29.0 127.942287     D
46:  4 30.0 136.388594     D
47:  4 32.0 140.092817     D
48:  4 32.5 108.346831     D
49:  4 37.0 140.732844     D
    id time        ang param
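To make the role of the x. prefix easier to see, here is a minimal sketch on made-up inputs. The two small tables below are hypothetical stand-ins for data and params, not the data from the question. As far as I understand data.table's behaviour, the join columns of a non-equi join carry the range bounds from params in the result, so x.time and x.ang are used to get the original values from data back. nomatch = NULL is added here so that params rows without any match are dropped instead of producing NA rows.

```r
library(data.table)

# Hypothetical toy inputs (assumed for illustration only)
params <- data.table(
  id          = c(1, 2),
  valid_from  = c(0, 0),
  valid_to    = c(10, 10),
  angle_begin = c(90, 0),
  angle_end   = c(180, 90),
  param       = c("A", "B")
)
data <- data.table(
  id   = c(1, 1, 2),
  time = c(5, 15, 5),
  ang  = c(120, 120, 45)
)

# For each row of params, look up the rows of data that fall inside both ranges.
# x.time / x.ang refer to the columns of data (the "x" table of the join).
res <- data[params,
            on = .(id, time >= valid_from, time <= valid_to,
                   ang >= angle_begin, ang <= angle_end),
            .(id, time = x.time, ang = x.ang, param),
            nomatch = NULL]

print(res)
```

In this sketch the row with time 15 lies outside the 0 to 10 window for id 1, so only two rows survive the join.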

If all rows of data should be kept:

params[data, 
       on = .(id, valid_from <= time, valid_to >= time, angle_begin <= ang, angle_end >= ang), 
       .(id, time = i.time, ang = i.ang, param)]
     id time       ang param
  1:  1  0.5 106.62639    NA
  2:  1  1.0 104.43426    NA
  3:  1  1.5  15.77429    NA
  4:  1  2.0 140.38305     A
  5:  1  2.5 322.31929    NA
 ---                        
396:  4 48.0 131.17405    NA
397:  4 48.5 335.47857    NA
398:  4 49.0 181.64450    NA
399:  4 49.5  90.96224    NA
400:  4 50.0  60.04268    NA

Given your wording, I interpret your question differently: I read it as wanting to keep all rows, but fill in param only when time and ang fall within the valid_ and angle_ ranges. Also note that, depending on whether values exactly on the range boundaries should count as matches, you may need >= and <= instead of > and <.

Thus, starting from Aramis7d's answer:

data %>%
  inner_join(params, by = "id") %>%
  mutate(param = ifelse(
           time >= valid_from & time <= valid_to & 
             ang >= angle_begin & ang <= angle_end,
           param,
           NA))
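One thing to watch with the inner_join() above: if params holds several ranges per id, every row of data is repeated once per range. A possible alternative, assuming dplyr 1.1.0 or later is available, is to put the range conditions into the join itself with join_by() and between(), and use left_join() so that unmatched rows of data are kept with param set to NA. The inputs below are hypothetical toy tables, not the data from the question, and between() is inclusive at both ends by default.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which provides join_by()

# Hypothetical toy inputs (assumed for illustration only)
params <- tibble(
  id          = c(1, 2),
  valid_from  = c(0, 0),
  valid_to    = c(10, 10),
  angle_begin = c(90, 0),
  angle_end   = c(180, 90),
  param       = c("A", "B")
)
data <- tibble(
  id   = c(1, 1, 2),
  time = c(5, 15, 10),
  ang  = c(120, 120, 90)
)

# Keep every row of data; attach param only where both ranges contain the row.
result <- data %>%
  left_join(
    params,
    join_by(id,
            between(time, valid_from, valid_to),
            between(ang, angle_begin, angle_end))
  ) %>%
  select(id, time, ang, param)

print(result)
```

Here the second row (time 15) has no matching range and keeps param as NA, while the third row sits exactly on both upper boundaries and still matches because the bounds are inclusive.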