I want to cross-join two data tables without evaluating the full cross join, using a ranging criterion in the process. In essence, I would like CJ with filtering/ranging ex
This looks like a problem that would benefit a lot from an interval tree algorithm, which can find the matching pairs without enumerating the full cross join. A very nice implementation is available in the Bioconductor package IRanges.
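# Installation (biocLite() has been retired; current Bioconductor releases use BiocManager)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("IRanges")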
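# Solution
library(IRanges)
# Treat each value in dt1$D as a width-1 range (a single point)
ir1 <- IRanges(dt1$D, width=1L)
# Treat each row of dt2 as the closed interval [D1, D2]
ir2 <- IRanges(dt2$D1, dt2$D2)
# Keep only the (query, subject) pairs where the point lies within the interval
olaps <- findOverlaps(ir1, ir2, type="within")
# Bind the matching rows of dt1 and dt2 side by side
cbind(dt1[queryHits(olaps)], dt2[subjectHits(olaps)])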
   id1  D id2 D1 D2
1:   3  6  21  5  9
2:   4  8  21  5  9
3:   4  8  22  7 12
4:   5 10  22  7 12
5:   5 10  23 10 16
6:   6 12  22  7 12
7:   6 12  23 10 16
8:   7 14  23 10 16
9:   8 16  23 10 16
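If you want to sanity-check what type="within" returns, here is a minimal base R sketch that tests every pair directly. The inputs are an assumption: dt1 and dt2 are not shown in this answer, so the values below are hypothetical ones chosen to be consistent with the output above, and plain data frames are used so that nothing needs to be installed. This does evaluate the full cross join, so it is only a check for tiny inputs, not a replacement for the IRanges approach.

```r
# Hypothetical inputs (not from the original post), chosen to match the output above
dt1 <- data.frame(id1 = 1:10, D = 2 * (1:10))
dt2 <- data.frame(id2 = 21:23, D1 = c(5, 7, 10), D2 = c(9, 12, 16))

# Brute force: inside[i, j] is TRUE when D1[j] <= D[i] <= D2[j]
inside <- outer(dt1$D, dt2$D1, ">=") & outer(dt1$D, dt2$D2, "<=")

# Row/column indices of the matching pairs, ordered by dt1 row and then dt2 row
hits <- which(inside, arr.ind = TRUE)
hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

# Bind the matching rows side by side, as in the cbind() call above
check <- cbind(dt1[hits[, "row"], ], dt2[hits[, "col"], ])
rownames(check) <- NULL
print(check)
```

With these assumed inputs the nine rows should line up with the table above. The outer() calls build a 10 x 3 logical matrix here, which is exactly the cost the interval-based lookup avoids on larger tables.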