R: how to remove certain rows in data.frame

Submitted by ∥☆過路亽.° on 2019-12-05 02:10:40

Question


> data = data.frame(a = c(100, -99, 322, 155, 256), b = c(23, 11, 25, 25, -999))
> data
    a    b
1 100   23
2 -99   11
3 322   25
4 155   25
5 256 -999

For such a data.frame I would like to remove any row that contains -99 or -999. So my resulting data.frame should only consist of rows 1, 3, and 4.

I was thinking of writing a loop for this, but I am hoping there's an easier way. (If my data.frame had columns a-z, the loop method would be very clunky.) My loop would probably look something like this:

keep = rep(TRUE, nrow(data))
for(i in 1:nrow(data)){
  # flag the row instead of deleting it, so the row indices do not shift mid-loop
  if(data$a[i] < 0 || data$b[i] < 0){
    keep[i] = FALSE
  }
}
data = data[keep,]

Answer 1:


Maybe this:

> ind <- Reduce(`|`, lapply(data, function(x) x %in% c(-99, -999)))
> data[!ind,]
    a  b
1 100 23
3 322 25
4 155 25
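
To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps. The names `sentinels`, `hits`, and `cleaned` are hypothetical and only used for illustration; the logic is the same as in the answer.

```r
data <- data.frame(a = c(100, -99, 322, 155, 256),
                   b = c(23, 11, 25, 25, -999))

# Hypothetical name: the values that mark a row for removal
sentinels <- c(-99, -999)

# One logical vector per column: TRUE where the value is a sentinel
hits <- lapply(data, function(x) x %in% sentinels)

# Combine the per-column vectors with an elementwise OR, so a row is
# flagged if any of its columns holds a sentinel
ind <- Reduce(`|`, hits)

# Keep only the rows that were not flagged
cleaned <- data[!ind, ]
print(cleaned)
```

Because `lapply` walks over every column, this works unchanged for a data.frame with columns a-z. Note that `%in%` returns FALSE rather than NA for missing values, so rows containing NA are kept by this approach.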



Answer 2:


> data[rowSums(data == -99 | data == -999) == 0, ]
    a  b
1 100 23
3 322 25
4 155 25

Both the "==" and the "|" (OR) operators treat a data.frame like a matrix and return a logical matrix of the same dimensions. rowSums then counts the TRUE values in each row, and only rows with a count of 0 are kept.
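
As a rough illustration of that explanation, the sketch below exposes the intermediate logical matrix and the per-row counts. The names `hit`, `n_hits`, and `cleaned` are hypothetical and not part of the answer.

```r
data <- data.frame(a = c(100, -99, 322, 155, 256),
                   b = c(23, 11, 25, 25, -999))

# Comparing a data.frame with a scalar gives a logical matrix of the
# same dimensions; `|` combines the two matrices element by element
hit <- data == -99 | data == -999
print(hit)

# rowSums counts TRUE as 1, so this is the number of sentinel values per row
n_hits <- rowSums(hit)
print(n_hits)

# Keep the rows that contain no sentinel value at all
cleaned <- data[n_hits == 0, ]
print(cleaned)
```

One caveat, assuming the data may contain missing values: a comparison with NA yields NA, so rowSums returns NA for such a row unless na.rm = TRUE is passed.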




Answer 3:


@rawr's comment probably makes the most sense: handle this while importing the data. Nevertheless, you can do something similar if you already have the data:

na.omit(replace(data, sapply(data,`%in%`,c(-99,-999)), NA))
#    a  b
#1 100 23
#3 322 25
#4 155 25
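
The comment itself is not quoted here, so the following is only one plausible way to handle the sentinel values at import time: the `na.strings` argument of `read.csv` turns them into NA while reading, and `na.omit` then drops the affected rows. The inline CSV text and the names `csv_text`, `imported`, and `cleaned` are assumptions made for this sketch.

```r
# Hypothetical inline CSV standing in for the real input file
csv_text <- "a,b
100,23
-99,11
322,25
155,25
256,-999"

# Treat the sentinel strings as missing values while reading
imported <- read.csv(text = csv_text, na.strings = c("-99", "-999"))

# Drop every row that contains at least one NA
cleaned <- na.omit(imported)
print(cleaned)
```

With a real file, the `text = csv_text` argument would be replaced by the file path. Keep in mind that `na.omit` also drops rows that were already missing for other reasons.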


Source: https://stackoverflow.com/questions/31304723/r-how-to-remove-certain-rows-in-data-frame
