I use factors somewhat infrequently and generally find them comprehensible, but I am often fuzzy on the details of specific operations. Currently, I am coding/collapsing cat
A late entry
Here is a wrapper for plyr::mapvalues that adds a remaining argument (your "other"):
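library(plyr)

# Wrapper around plyr::mapvalues. When `remaining` is supplied, every value of
# x that is not listed in `from` is mapped to `remaining`.
Mapvalues <- function(x, from, to, warn_missing = TRUE, remaining = NULL) {
  if (!is.null(remaining)) {
    # values present in x but not covered by `from`
    therest <- setdiff(x, from)
    from <- c(from, therest)
    to <- c(to, rep_len(remaining, length(therest)))
  }
  mapvalues(x, from, to, warn_missing = warn_missing)
}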
# replace the remaining values with "other"
Mapvalues(data$naics, top8, top8_desc, remaining = 'other')
# leave the remaining values alone
Mapvalues(data$naics, top8, top8_desc)
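
To try this without the original data, here is a small self-contained sketch. The vector `naics` and the lookups `top2` and `top2_desc` are made-up toy values standing in for `data$naics`, `top8` and `top8_desc`, and the wrapper is repeated so the snippet runs on its own.

```r
library(plyr)

# Same wrapper as above, repeated so this snippet is self-contained.
Mapvalues <- function(x, from, to, warn_missing = TRUE, remaining = NULL) {
  if (!is.null(remaining)) {
    therest <- setdiff(x, from)
    from <- c(from, therest)
    to <- c(to, rep_len(remaining, length(therest)))
  }
  mapvalues(x, from, to, warn_missing = warn_missing)
}

# Hypothetical toy data: five codes, of which only two get a description.
naics     <- c("11", "21", "22", "11", "23")
top2      <- c("11", "21")
top2_desc <- c("Agriculture", "Mining")

# Collapse everything outside top2 into "other".
collapsed <- Mapvalues(naics, top2, top2_desc, remaining = "other")
collapsed
# "Agriculture" "Mining" "other" "Agriculture" "other"

# Without `remaining`, unmatched codes pass through unchanged.
untouched <- Mapvalues(naics, top2, top2_desc)
untouched
# "Agriculture" "Mining" "22" "Agriculture" "23"

# On a factor, the recoding is applied to the levels, so the
# unmatched levels are merged into a single "other" level.
collapsed_factor <- Mapvalues(factor(naics), top2, top2_desc, remaining = "other")
levels(collapsed_factor)
```

One thing to check on real data: for a character vector that contains NA, `setdiff(x, from)` keeps the NA, so it would also be recoded to the `remaining` value. Drop or handle missing values first if that is not what you want.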