Equal frequency discretization in R

南方客 2020-12-17 02:56

I'm having trouble finding a function in R that performs equal-frequency discretization. I stumbled on the 'infotheo' package, but after some testing I found that the al

8 answers
  • 2020-12-17 03:52

    This sort of thing is also quite easily solved by using (abusing?) the conditioning plot infrastructure from lattice, in particular function co.intervals():

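    cutEqual <- function(x, n, include.lowest = TRUE, ...) {
        stopifnot(require(lattice))
        # co.intervals() returns an n x 2 matrix of (lower, upper) interval bounds;
        # with overlap 0, the first lower bound plus all n upper bounds serve as breaks
        cut(x, co.intervals(x, n, 0)[c(1, (n+1):(n*2))],
            include.lowest = include.lowest, ...)
    }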
    

    Which reproduces @Joris' excellent answer:

    > set.seed(12345)
    > x <- rnorm(50)
    > table(cutEqual(x, 5))
    
     [-2.38,-0.885] (-0.885,-0.115]  (-0.115,0.587]   (0.587,0.938]     (0.938,2.2] 
                 10              10              10              10              10
    > y <- rpois(50, 5)
    > table(cutEqual(y, 5))
    
     [0.5,3.5]  (3.5,5.5]  (5.5,6.5]  (6.5,7.5] (7.5,11.5] 
            10         13         11          6         10
    

    In the latter, discrete case, the breaks differ from those in @Joris' answer but have the same effect: the same observations fall into the same bins.
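
For comparison, the same idea can be sketched in base R with quantile() and cut(). This is only a minimal sketch, not code taken from any answer here: the helper name cut_quantile is hypothetical, and it assumes the quantile breaks are all distinct (heavily tied data can yield duplicate breaks, which cut() rejects).

```r
# Hypothetical helper: equal-frequency bins from quantile breaks (base R only).
cut_quantile <- function(x, n, ...) {
    # n + 1 breaks at probabilities 0, 1/n, ..., 1
    breaks <- quantile(x, probs = seq(0, 1, length.out = n + 1), names = FALSE)
    # include.lowest = TRUE keeps the minimum inside the first bin
    cut(x, breaks = breaks, include.lowest = TRUE, ...)
}

set.seed(12345)
x <- rnorm(50)
counts <- table(cut_quantile(x, 5))
```

With 50 distinct values and 5 bins, each bin should receive 10 observations. With tied values the counts can drift apart, as the rpois example shows.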

  • 2020-12-17 03:54

    Here's another solution using mltools.

    set.seed(1)
    x <- round(rnorm(20), 2)
    x.binned <- mltools::bin_data(x, bins = 5, binType = "quantile")
    table(x.binned)
    
    x.binned
    [-2.21, -0.622)   [-0.622, 0.1)    [0.1, 0.526)  [0.526, 0.844)    [0.844, 1.6] 
                  4               4               4               4               4 
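
When the data contain many ties, as in the rpois example above, value-based breaks cannot separate tied observations, so the bin counts stop being equal. One possible workaround, assuming it is acceptable for tied values to land in different bins, is to bin by rank instead of by value. The helper name bin_by_rank is hypothetical and is not part of mltools or any other package mentioned here; the sketch also assumes x has no missing values.

```r
# Hypothetical helper: assign bin ids 1..n by rank so bin sizes stay equal.
bin_by_rank <- function(x, n) {
    # break ties by order of appearance so every rank is unique
    r <- rank(x, ties.method = "first")
    # map ranks 1..length(x) onto bin ids 1..n in equal-sized chunks
    ceiling(r * n / length(x))
}

set.seed(12345)
y <- rpois(50, 5)
bins <- bin_by_rank(y, 5)
counts <- table(bins)
```

The trade-off is that the result is a vector of bin ids rather than labelled intervals, and two equal values may receive different ids.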
    