Equal frequency discretization in R

南方客 2020-12-17 02:56

I'm having trouble finding a function in R that performs equal-frequency discretization. I stumbled on the 'infotheo' package, but after some testing I found that the al

8 Answers
  •  北海茫月
    2020-12-17 03:49

    Here is a function that handles the error "'breaks' are not unique" and automatically falls back to the closest n_bins value below the one you set that works.

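    equal_freq <- function(var, n_bins)
    {
      # cut_number() lives in ggplot2; call it by namespace instead of attaching the package
      if (!requireNamespace("ggplot2", quietly = TRUE))
        stop("Package 'ggplot2' is required for cut_number().")
    
      n_bins_orig <- n_bins
    
      # Try to cut into n bins; return NULL on any error so the check below
      # does not depend on the exact wording of the error message
      try_cut <- function(n)
        tryCatch(ggplot2::cut_number(var, n = n), error = function(e) NULL)
    
      res <- try_cut(n_bins)
      # Reduce the number of bins until the cut succeeds
      while (is.null(res) && n_bins > 1)
      {
        n_bins <- n_bins - 1
        res <- try_cut(n_bins)
      }
      if (is.null(res))
        stop("Could not bin 'var', even with a single bin.")
      if (n_bins_orig != n_bins)
        warning(sprintf("It's not possible to calculate with n_bins=%s, setting n_bins in: %s.", n_bins_orig, n_bins))
    
      return(res)
    }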
    

    Example:

    equal_freq(mtcars$carb, 10)
    

    This returns the binned variable and raises the following warning:

    It's not possible to calculate with n_bins=10, setting n_bins in: 5.
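
    For comparison, the same idea can be sketched in base R without ggplot2. This is only a minimal sketch and not the function above: equal_freq_base is a hypothetical name, and instead of lowering n_bins step by step it drops repeated quantile breaks with unique(), so heavily tied data can end up with fewer bins than requested and with less balanced bin sizes.

    ```r
    # Base-R sketch of equal-frequency binning (equal_freq_base is a hypothetical name)
    equal_freq_base <- function(x, n_bins) {
      # Bin edges at evenly spaced quantiles: 0, 1/n_bins, ..., 1
      probs <- seq(0, 1, length.out = n_bins + 1)
      # unique() drops repeated edges caused by ties in x
      brks <- unique(quantile(x, probs = probs, na.rm = TRUE, names = FALSE))
      if (length(brks) < 2)
        stop("x needs at least two distinct values to form a bin")
      # include.lowest = TRUE keeps the minimum of x inside the first bin
      cut(x, breaks = brks, include.lowest = TRUE)
    }

    binned <- equal_freq_base(mtcars$carb, 10)
    table(binned)  # inspect how many observations fall in each bin
    ```

    Checking table(binned) afterwards is a quick way to see how far the result is from truly equal frequencies.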
    
