Remove the rows of data frame whose cells match a given vector

Posted by 一笑奈何 on 2019-12-21 06:16:47

Question


I have a large data frame with varying numbers of columns and rows. I would like to search the data frame for the values of a given vector and remove every row containing a cell that matches one of those values. I'd like to have this as a function, because I have to run it on multiple data frames with varying numbers of rows and columns, and I would like to avoid for loops.

For example:

ff<-structure(list(j.1 = 1:13, j.2 = 2:14, j.3 = 3:15), .Names = c("j.1","j.2", "j.3"), row.names = c(NA, -13L), class = "data.frame")

Remove all rows that have cells containing the values 8, 9 or 10.

I guess I could use ff[ !ff[,1] %in% c(8, 9, 10), ] or subset(ff, !ff[,1] %in% c(8,9,10) )

but that only checks the first column. To remove all the values from the dataset I would have to go through each column (probably with a for loop, something I wish to avoid).

Is there any other (cleaner) way?

Thanks a lot


Answer 1:


Apply your test to each row:

keeps <- apply(ff, 1, function(x) !any(x %in% 8:10))

This gives a logical vector with one element per row. Then subset with it:

ff[keeps,]

   j.1 j.2 j.3
1    1   2   3
2    2   3   4
3    3   4   5
4    4   5   6
5    5   6   7
11  11  12  13
12  12  13  14
13  13  14  15
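Since the question asks for a reusable function, the same test can be wrapped up as below. This is only a minimal sketch: the name drop_matching_rows and its arguments are hypothetical, not from the original answer. It checks column by column rather than with apply, because apply first converts the data frame to a matrix, which may behave unexpectedly when the columns have mixed types.

```r
# Hypothetical helper (not from the original answer): drop every row
# in which at least one cell matches one of `values`.
drop_matching_rows <- function(df, values) {
  # One logical vector per column: TRUE where the cell is in `values`.
  hits_by_column <- lapply(df, function(col) col %in% values)
  # Combine the columns with OR: TRUE where any cell in the row matched.
  row_has_hit <- Reduce(`|`, hits_by_column)
  # Keep the rows without a match; drop = FALSE keeps a data frame
  # even when only one column is present.
  df[!row_has_hit, , drop = FALSE]
}

ff <- data.frame(j.1 = 1:13, j.2 = 2:14, j.3 = 3:15)
res <- drop_matching_rows(ff, 8:10)
print(res)
```

The same call can then be repeated for each data frame, for example with lapply over a list of data frames.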



Answer 2:


I suppose the apply strategy may turn out to be the most economical, but one could also do either of these:

ff[ !rowSums( sapply( ff, function(x) x %in% 8:10) ) , ]
ff[ !Reduce("+", lapply( ff, function(x) x %in% 8:10) ) , ]

Both add up the logical vectors column by column, so a nonzero sum for a row is equivalent to any, and then negate the result to keep only the rows with no match. I suspect the first one would be faster.
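As a quick sanity check, the three approaches from the answers can be compared on the example data. This is a sketch only; the variable names by_apply, by_sums and by_reduce are made up for illustration, and ff is rebuilt with data.frame rather than structure.

```r
ff <- data.frame(j.1 = 1:13, j.2 = 2:14, j.3 = 3:15)

# Row-wise test from the first answer.
by_apply  <- ff[apply(ff, 1, function(x) !any(x %in% 8:10)), ]
# Column-wise matches summed per row, then negated (0 hits -> keep).
by_sums   <- ff[!rowSums(sapply(ff, function(x) x %in% 8:10)), ]
# Same idea, adding the per-column logical vectors with Reduce.
by_reduce <- ff[!Reduce("+", lapply(ff, function(x) x %in% 8:10)), ]

# stopifnot raises an error if any of the results differ.
stopifnot(identical(by_apply, by_sums), identical(by_sums, by_reduce))
```

Which variant is fastest will depend on the size and shape of the data, so it is worth timing them on a representative data frame rather than relying on a guess.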



Source: https://stackoverflow.com/questions/11004203/remove-the-rows-of-data-frame-whose-cells-match-a-given-vector
