Remove the rows of data frame whose cells match a given vector

Question

I have a big data frame with varying numbers of columns and rows. I would like to search the data frame for the values of a given vector and remove every row in which any cell matches one of those values. I'd like to have this as a function, because I have to run it on multiple data frames with different numbers of rows and columns, and I would like to avoid for loops.

For example:

ff<-structure(list(j.1 = 1:13, j.2 = 2:14, j.3 = 3:15), .Names = c("j.1","j.2", "j.3"), row.names = c(NA, -13L), class = "data.frame")

Remove all rows that have cells containing the values 8, 9, or 10.

I guess I could use ff[ !ff[,1] %in% c(8, 9, 10), ] or subset(ff, !ff[,1] %in% c(8,9,10) ), which only checks the first column.

But in order to remove all the values from the dataset I have to go through each column (probably with a for loop, something I wish to avoid).

Is there any other (cleaner) way?

Thanks a lot

Answer 1:

Apply your test to each row:

keeps <- apply(ff, 1, function(x) !any(x %in% 8:10))

This gives a logical vector with one element per row. Then subset with it:

ff[keeps,]

   j.1 j.2 j.3
1    1   2   3
2    2   3   4
3    3   4   5
4    4   5   6
5    5   6   7
11  11  12  13
12  12  13  14
13  13  14  15
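
Since the question asks for a reusable function, the row-wise test above could be wrapped up roughly as follows. This is only a sketch: the name `drop_rows_matching` and its arguments are hypothetical and not part of the original answer. It also assumes that all columns share one type (integer here), because `apply` coerces the data frame to a matrix before testing each row.

# Hypothetical helper: drop every row in which any cell matches `values`.
# Assumes all columns share one type, because apply() coerces to a matrix.
drop_rows_matching <- function(df, values) {
  keeps <- apply(df, 1, function(x) !any(x %in% values))
  df[keeps, , drop = FALSE]  # drop = FALSE keeps a data frame even with one column
}

ff <- data.frame(j.1 = 1:13, j.2 = 2:14, j.3 = 3:15)
result <- drop_rows_matching(ff, c(8, 9, 10))
rownames(result)  # "1" "2" "3" "4" "5" "11" "12" "13"

The same function can then be called on each data frame in turn, for example with `lapply` over a list of data frames.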

Answer 2:

I suppose the apply strategy may turn out to be the most economical, but one could also do either of these:

ff[ !rowSums( sapply( ff, function(x) x %in% 8:10) ) , ]
ff[ !Reduce("+", lapply( ff, function(x) x %in% 8:10) ) , ]

Both test each column with %in%, add the resulting logical vectors (so a row sum greater than zero is equivalent to any), and then negate the result to keep only the rows with no match. I suspect the first one would be faster.
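
To make the "addition then negation" step concrete, the sketch below builds the keep-vector in all three ways on the example data and checks that they agree. The variable names (`hits`, `keep_rowsums`, and so on) are made up for illustration. One difference worth noting: the column-wise variants test each column in its own type, whereas `apply` first coerces the whole data frame to a matrix.

ff <- data.frame(j.1 = 1:13, j.2 = 2:14, j.3 = 3:15)

# Column-wise membership test: one logical column per data frame column.
hits <- sapply(ff, function(x) x %in% 8:10)

# rowSums() counts the matches in each row; !0 is TRUE, so rows with
# no match are kept and rows with at least one match are dropped.
keep_rowsums <- !rowSums(hits)

# Same idea without building a matrix: add the logical vectors pairwise.
keep_reduce <- !Reduce("+", lapply(ff, function(x) x %in% 8:10))

# Row-wise version from the first answer, for comparison.
keep_apply <- apply(ff, 1, function(x) !any(x %in% 8:10))

# All three select the same rows.
stopifnot(all(keep_rowsums == keep_reduce), all(keep_rowsums == keep_apply))
which(keep_rowsums)  # positions of the rows that are kept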

Source: https://stackoverflow.com/questions/11004203/remove-the-rows-of-data-frame-whose-cells-match-a-given-vector

Tags

dataframe

subset