One can filter rows in dplyr with filter(), but the condition is usually based on specific columns per row, such as
d <- data.frame(x=c
You could use rowSums or rowMeans for that. An example with the provided data:
> d
   x  y  z
1  1  3 NA
2  2 NA  4
3 NA NA  5
# with rowSums:
d %>% filter(rowSums(is.na(.))/ncol(.) < 0.5)
# with rowMeans:
d %>% filter(rowMeans(is.na(.)) < 0.5)
which both give:
  x  y  z
1 1  3 NA
2 2 NA  4
As you can see, row 3 is removed from the data because two of its three values are NA, which is above the 50% threshold.
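To see why both conditions agree, it may help to look at the intermediate values. The following is a minimal base R sketch. The `data.frame()` call is an assumption reconstructed from the printed output of `d`, and the variable names (`na_mask`, `na_frac`, `na_frac_sums`, `keep`) are only illustrative.

```r
# Assumed reconstruction of the example data from the printed output of d.
d <- data.frame(x = c(1, 2, NA), y = c(3, NA, NA), z = c(NA, 4, 5))

# is.na() on a data frame returns a logical matrix of the same shape.
na_mask <- is.na(d)

# TRUE counts as 1 and FALSE as 0, so the row mean is the fraction of NAs per row.
na_frac <- rowMeans(na_mask)

# The same fraction via rowSums(), divided by the number of columns.
na_frac_sums <- rowSums(na_mask) / ncol(d)

# Rows to keep: those with less than half of their values missing.
keep <- na_frac < 0.5
```

Rows 1 and 2 each have one NA out of three values, while row 3 has two out of three, so only row 3 fails the `< 0.5` test.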
In base R, you could just do:
d[rowMeans(is.na(d)) < 0.5,]
to get the same result.
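If you need this in several places, the base R version could be wrapped in a small helper. This is only a sketch: `drop_sparse_rows` and its `max_na_frac` argument are hypothetical names, not part of dplyr or base R, and the example data is again an assumed reconstruction from the printed output.

```r
# Hypothetical helper: keep rows whose fraction of NA values is below max_na_frac.
drop_sparse_rows <- function(df, max_na_frac = 0.5) {
  stopifnot(is.data.frame(df))
  keep <- rowMeans(is.na(df)) < max_na_frac
  # drop = FALSE keeps the result a data frame even when df has a single column.
  df[keep, , drop = FALSE]
}

# Assumed reconstruction of the example data from the printed output of d.
d <- data.frame(x = c(1, 2, NA), y = c(3, NA, NA), z = c(NA, 4, 5))
result <- drop_sparse_rows(d)
```

Because the comparison is strict, a row with exactly half of its values missing is removed as well. Use `<=` instead if such rows should be kept.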