Find which interval row in a data frame that each element of a vector belongs in

一整个雨季 2020-11-29 06:29

I have a vector of numeric elements, and a dataframe with two columns that define the start and end points of intervals. Each row in the dataframe is one interval. I want to

7 answers
  •  眼角桃花
    2020-11-29 06:51

    David Arenburg's mention of non-equi joins was very helpful for understanding what general kind of problem this is (thanks!). I can see now that it's not implemented for dplyr. Thanks to this answer, I see that there is a fuzzyjoin package that can do it in the same idiom. But it's barely any simpler than my map solution above (though more readable, in my view), and doesn't hold a candle to thelatemail's cut answer for brevity.

    For my example above, the fuzzyjoin solution would be

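    library(fuzzyjoin)
    library(tidyverse)  # for %>% and distinct()
    
    # Keep every element; match it to rows where start <= element <= end.
    # Each entry in match_fun applies to the corresponding pair in by.
    fuzzy_left_join(data.frame(elements), intervals, 
                    by = c("elements" = "start", "elements" = "end"), 
                    match_fun = list(`>=`, `<=`)) %>% 
      distinct()
    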
    

    Which gives:

      elements phase start end
    1      0.1     a     0 0.5
    2      0.2     a     0 0.5
    3      0.5     a     0 0.5
    4      0.9  <NA>    NA  NA
    5      1.1     b     1 1.9
    6      1.9     b     1 1.9
    7      2.1     c     2 2.5
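
For comparison, here is a minimal base R sketch of the same closed-interval lookup, which can be used to sanity-check the join. The `elements` and `intervals` values are assumptions reconstructed from the output shown in this answer, not copied from the question, and `idx` and `result` are just illustrative names.

```r
# Assumed inputs, reconstructed from the output shown in this answer
elements <- c(0.1, 0.2, 0.5, 0.9, 1.1, 1.9, 2.1)
intervals <- data.frame(phase = c("a", "b", "c"),
                        start = c(0, 1, 2),
                        end   = c(0.5, 1.9, 2.5))

# For each element, take the first row whose closed interval
# [start, end] contains it, or NA if no interval matches
idx <- vapply(elements, function(x) {
  hit <- which(x >= intervals$start & x <= intervals$end)
  if (length(hit) == 0) NA_integer_ else hit[1]
}, integer(1))

# Indexing with NA yields an all-NA row, mirroring the left join
result <- data.frame(elements, intervals[idx, ], row.names = NULL)
```

This scans every interval for every element, so it is only meant for small inputs. It also assumes the intervals do not overlap; if they do, only the first matching row is kept, whereas the join would return one row per match.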
    
