Suppose I have a data frame (df1) with factors like this:
factor1 factor2 factor3
------- ------- -------
d a x
d a x
b a
I would write a small helper function that uses table() to count how many times each level of a factor occurs. Run table(df$fac1) to see how this works. Note that this isn't very robust, but it should get you started:
df <- data.frame(fac1 = factor(c("d", "d", "b", "b", "b", "c", "c", "c", "c")),
                 fac2 = factor(c("a", "a", "a", "c", "c", "c", "n", "n", "n")),
                 fac3 = factor(c(rep("x", 4), rep("y", 5))),
                 other = 1:9)
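at_least_three_instances <- function(column) {
  if (is.factor(column)) {
    # Keep a factor only if its rarest level occurs at least three times
    if (min(table(column)) > 2) {
      return(TRUE)
    } else {
      return(FALSE)
    }
  } else {
    # Non-factor columns are always kept
    return(TRUE)
  }
}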
df[unlist(lapply(df, at_least_three_instances))]
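For the example data above, that call should drop fac1 (its level "d" occurs only twice) and keep fac2, fac3 and other. If you want something a little more flexible, here is a minimal sketch of a variant with an adjustable threshold. The name has_min_instances and the min_count parameter are my own, not part of base R. It also makes one assumption that differs from the helper above: levels that are declared but never occur are ignored (via droplevels()) rather than counted as zero.

```r
# Same example data as above, rebuilt so this block runs on its own.
df <- data.frame(
  fac1 = factor(c("d", "d", "b", "b", "b", "c", "c", "c", "c")),
  fac2 = factor(c("a", "a", "a", "c", "c", "c", "n", "n", "n")),
  fac3 = factor(c(rep("x", 4), rep("y", 5))),
  other = 1:9
)

# Per-level counts; this is what the helper inspects for each factor.
table(df$fac1)

# Hypothetical helper: min_count is an assumed parameter name.
has_min_instances <- function(column, min_count = 3) {
  if (!is.factor(column)) return(TRUE)   # non-factors are always kept
  counts <- table(droplevels(column))    # assumption: ignore unused levels
  # A factor with no observed levels has nothing to violate the threshold.
  length(counts) == 0 || min(counts) >= min_count
}

# vapply() guarantees exactly one logical value per column.
keep <- vapply(df, has_min_instances, logical(1))
result <- df[keep]
names(result)
```

Using vapply() instead of unlist(lapply()) means the call fails with an error if the helper ever returns something other than a single TRUE or FALSE, instead of silently producing a selection vector of the wrong length.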