I'm having trouble finding a function in R that performs equal-frequency discretization. I stumbled on the 'infotheo' package, but after some testing I found that the al
Here is a function that handles the error 'breaks' are not unique and automatically selects the closest n_bins value to the one you requested.
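equal_freq <- function(var, n_bins)
{
  # cut_number() comes from ggplot2; fail early if the package is missing
  stopifnot(requireNamespace("ggplot2", quietly = TRUE))
  n_bins_orig <- n_bins
  res <- tryCatch(ggplot2::cut_number(var, n = n_bins), error = function(e) e)
  # On error (e.g. non-unique breaks), retry with one bin fewer.
  # Checking the condition class avoids depending on the exact error text.
  while (inherits(res, "error") && n_bins > 1)
  {
    n_bins <- n_bins - 1
    res <- tryCatch(ggplot2::cut_number(var, n = n_bins), error = function(e) e)
  }
  if (n_bins_orig != n_bins)
    warning(sprintf("It's not possible to calculate with n_bins=%s, setting n_bins in: %s.", n_bins_orig, n_bins))
  return(res)
}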
Example:
equal_freq(mtcars$carb, 10)
This returns the binned variable along with the following warning:
It's not possible to calculate with n_bins=10, setting n_bins in: 5.
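If you would rather not depend on ggplot2, the same fallback idea can be sketched in base R. Treat this as a rough sketch rather than a drop-in replacement: the name equal_freq_base is made up for illustration, it relies on the default settings of quantile(), and its warning text is my own wording.

```r
# Hypothetical helper (not part of any package): equal-frequency binning
# in base R, lowering the number of bins until the quantile breaks are unique.
equal_freq_base <- function(var, n_bins)
{
  n_bins_orig <- n_bins
  repeat
  {
    # n_bins + 1 quantile-based break points, running from min to max
    brk <- stats::quantile(var, probs = seq(0, 1, length.out = n_bins + 1),
                           na.rm = TRUE, names = FALSE)
    if (!anyDuplicated(brk) || n_bins <= 1) break
    n_bins <- n_bins - 1
  }
  # Even a single bin fails when min and max coincide
  if (anyDuplicated(brk))
    stop("all values are identical; cannot build bins")
  if (n_bins_orig != n_bins)
    warning(sprintf("Could not use n_bins=%s, using n_bins=%s instead.",
                    n_bins_orig, n_bins))
  cut(var, breaks = brk, include.lowest = TRUE)
}

binned <- suppressWarnings(equal_freq_base(mtcars$carb, 10))
table(binned)  # number of observations per bin
```

Looking at table(binned) is a useful sanity check: when the variable has many tied values, as mtcars$carb does, the bins can only be approximately equal in frequency.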