I'm trying to collapse a nominal categorical vector by combining low-frequency categories into an 'Other' category:
The data (column of a dataframe) looks like this, and c
Using the package dplyr, and assuming your data frame (let's call it State) has one field called ID for each State name...
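library(dplyr)
# Count each ID and flag those whose share of all rows exceeds 0.2
filtered_data <- State %>% group_by(ID) %>% summarise(n = n(),
                                                      freq = n / nrow(State),
                                                      above_thresh = freq > 0.2)
# Convert to character first so a new label can be assigned even if ID is a factor
filtered_data$ID <- as.character(filtered_data$ID)
filtered_data$ID[filtered_data$above_thresh] <- "above_0.2"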
Effectively, this replaces the state name of anything with a frequency above 0.2 with the label "above_0.2".
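
If the goal is the reverse, i.e. keeping the common states and pooling the rare ones into 'Other' on the original rows rather than on the summary table, something along these lines might work. This is only a sketch: the sample State data, the 0.2 cutoff, and the ID_collapsed column name are all made up for illustration.

```r
library(dplyr)

# Hypothetical example data: one row per observation, ID holds the state name
State <- data.frame(
  ID = c(rep("CA", 5), rep("TX", 3), "NY", "WA"),
  stringsAsFactors = FALSE
)

threshold <- 0.2  # assumed cutoff; IDs at or below this share are pooled

collapsed <- State %>%
  group_by(ID) %>%
  mutate(freq = n() / nrow(State)) %>%  # relative frequency of each ID
  ungroup() %>%
  # as.character() avoids ifelse() returning integer codes if ID is a factor
  mutate(ID_collapsed = ifelse(freq > threshold, as.character(ID), "Other"))

table(collapsed$ID_collapsed)
```

Here NY and WA each make up 10% of the rows, so both end up as 'Other', while CA and TX keep their names.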