Using ifelse to remove unwanted rows from the dataset in R

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-05 20:31:43

You don't even need ifelse() if all you want is an indicator of which rows to remove.

ind <- (Month == "11") &
           ((ID == "1" & Year == "2006") | (ID == "2" & Year == "2007"))

ind is TRUE for every row where Month is "11" and at least one of the two bracketed subclauses is TRUE.
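To see how the element-wise operators build the indicator, here is a minimal sketch on four hypothetical rows (the values are invented for illustration and are not from the question). It also checks that wrapping the condition in ifelse() adds nothing:

```r
# Hypothetical toy vectors, one element per row of an imagined data set
ID    <- c("1", "1", "2", "2")
Year  <- c("2006", "2007", "2007", "2007")
Month <- c("11", "11", "11", "10")

# & and | work element by element, so the result has one value per row
ind <- (Month == "11") &
  ((ID == "1" & Year == "2006") | (ID == "2" & Year == "2007"))
ind

# ifelse() would only map TRUE to TRUE and FALSE to FALSE here
identical(ifelse(ind, TRUE, FALSE), ind)
```

Only the first and third rows satisfy both the Month test and one of the ID/Year pairs, so only those are flagged.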

Then you can drop those rows using !ind in any subset operation via [ or subset().

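# Example data: 144 rows; data.frame() recycles Year (72 values) and Month (36 values)
dat <- data.frame(ID = rep(c("1","2"), each = 72),
                  Year = rep(c("2006","2007","2008"), each = 24),
                  Month = rep(as.character(1:12), times = 3))
# TRUE for the rows to remove
ind <- with(dat, (Month == "11") & ((ID == "1" & Year == "2006") |
                                    (ID == "2" & Year == "2007")))
ind
# Keep everything except the flagged rows
dat2 <- dat[!ind, ]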

Which gives

R> ind
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
[109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
[121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
R> dat2 <- dat[!ind, ]
R> nrow(dat)
[1] 144
R> nrow(dat2)
[1] 140

which is correct for the example data: four rows (11, 23, 107 and 119) are dropped.
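The subset() route mentioned above can be sketched as follows, assuming the same example data. The name dat3 is only an illustrative choice; which() is used to list the positions of the flagged rows.

```r
# Same example data as above; Year and Month are recycled to 144 rows
dat <- data.frame(ID = rep(c("1", "2"), each = 72),
                  Year = rep(c("2006", "2007", "2008"), each = 24),
                  Month = rep(as.character(1:12), times = 3))

# subset() evaluates the condition inside dat, so no with() is needed
dat3 <- subset(dat, !((Month == "11") &
                      ((ID == "1" & Year == "2006") |
                       (ID == "2" & Year == "2007"))))
nrow(dat3)

# which() turns the logical indicator into the positions of the flagged rows
ind <- with(dat, (Month == "11") & ((ID == "1" & Year == "2006") |
                                    (ID == "2" & Year == "2007")))
which(ind)
```

One difference to keep in mind: subset() silently drops rows where the condition is NA, whereas dat[!ind, ] returns a row of NAs for each NA in ind. The example data has no missing values, so both give the same result here.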

mnel

A data.table solution, which is time- and memory-efficient (and needs slightly less code). It scales well to big data sets.

If the columns were integer rather than factor:

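library(data.table)
DT <- data.table(ID = rep(1:2, each = 72),
                 Year = rep(2006:2008, each = 24),
                 Month = rep(1:12, times = 3))
# or you could use:   DT <- as.data.table(dat)
# (dat holds character columns, so the J() values below would then need to be character too)
setkey(DT, ID, Year, Month)
# Look up the row numbers of the two key combinations, then drop those rows
DT[-DT[J(1:2, 2006:2007, 11), which = TRUE]]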
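As a possible variation, not part of the answer above, data.table's not-join syntax (a leading ! on the join table) expresses the same removal without the intermediate row numbers. This is a sketch that assumes the data.table package is installed; the columns are written out at full length so that nothing depends on recycling, and DT2 is just an illustrative name.

```r
library(data.table)

# Same 144-row layout as above, with every column given at full length
DT <- data.table(ID = rep(1:2, each = 72),
                 Year = rep(rep(2006:2008, each = 24), times = 2),
                 Month = rep(1:12, times = 12))
setkey(DT, ID, Year, Month)

# Not-join: keep every row that does NOT match one of the two key combinations
DT2 <- DT[!J(1:2, 2006:2007, 11L)]
nrow(DT2)
```

Here J() builds a two-row lookup table, (1, 2006, 11) and (2, 2007, 11), and the ! keeps only the rows of DT whose key matches neither of them.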