R - subset data frame - check if value lies in range

荒凉一梦 submitted on 2020-01-06 05:45:07

Question


I have the following two data frames:

d1 <- data.frame(chr = c("chr1", "chr2", "chr2"), pos = c(11, 15, 21), type = c("type1", "type2", "type1"))

    > d1
       chr pos  type
    1 chr1  11 type1
    2 chr2  15 type2
    3 chr2  21 type1


d2 <- data.frame(chr = c("chr1", "chr2", "chr4"), start = c(10, 15, 30), stop = c(13, 20, 40))

    > d2
       chr start stop
    1 chr1    10   13
    2 chr2    15   20
    3 chr4    30   40

I want to subset d1 on two conditions:

  • keep all rows where 'type' == "type1" (I know how to do this)
  • keep all rows where 'chr' matches any of the rows in d2 and 'pos' falls between the 'start' and 'stop' values from that row in d2

In this case, the resulting d3 would contain only row 1 of d1:

    > d3
    chr pos  type
 1 chr1  11 type1

I would start like this:

 d3 <- subset(d1, d1$type == "type1" & ...)

Answer 1:


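We can combine all the conditions into a single logical condition and use it to subset d1. Note that the comparisons against d2$start and d2$stop are element-wise: row i of d1 is checked against row i of d2. The expression therefore gives the expected result here because both data frames have three rows in a compatible order, but it does not test each 'pos' against every interval in d2.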

d1[d1$type=="type1" & d1$chr %in% d2$chr & d1$pos >= d2$start & d1$pos <= d2$stop, ]

#   chr pos  type
#1 chr1  11 type1
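
If d1 and d2 differ in length or row order, the element-wise comparison above no longer answers the question as asked. The following is a minimal sketch, not part of the original answer, that checks each row of d1 against every interval in d2 with a matching 'chr'. The name `in_range` is a hypothetical helper variable introduced only for illustration, and the interval bounds are assumed to be inclusive, as in the answer above.

```r
d1 <- data.frame(chr = c("chr1", "chr2", "chr2"),
                 pos = c(11, 15, 21),
                 type = c("type1", "type2", "type1"))
d2 <- data.frame(chr = c("chr1", "chr2", "chr4"),
                 start = c(10, 15, 30),
                 stop = c(13, 20, 40))

# For each (chr, pos) pair in d1, test whether ANY row of d2 has the
# same chr and an interval [start, stop] that contains pos.
# in_range is an illustrative name, not from the original post.
in_range <- mapply(
  function(chr, pos) any(d2$chr == chr & d2$start <= pos & pos <= d2$stop),
  as.character(d1$chr), d1$pos,
  USE.NAMES = FALSE
)

# Combine with the type filter; this does not depend on row order in d2.
d3 <- d1[d1$type == "type1" & in_range, ]
d3
```

On the example data this keeps only row 1 of d1 (chr1, pos 11, type1). For large tables a join-based approach may scale better than a per-row scan, but that is outside what the original answer covers.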


Source: https://stackoverflow.com/questions/57266240/r-subset-data-frame-check-if-value-lies-in-range
