Cut() error - 'breaks' are not unique

甜味超标 2020-11-29 04:12

I have the following data frame:

 a         
    ID   a.1    b.1     a.2   b.2
1    1  40.00   100.00  NA    88.89
2    2  100.00  100.00  100   100.00
3    3  5         


        
4 answers
  •  悲哀的现实
    2020-11-29 04:36

    You get this error because some of the quantiles of columns b.1, a.2 and b.2 are identical. cut() requires every value in breaks to be unique, so these quantiles cannot be passed to it directly.

    apply(a, 2, quantile, na.rm = TRUE)
           ID      a.1    b.1   a.2      b.2
    0%   1.00  37.5000  59.38  75.0  59.3800
    25%  2.25  42.5000 100.00  87.5  91.6675
    50%  3.50  58.3350 100.00 100.0 100.0000
    75%  4.75  91.6675 100.00 100.0 100.0000
    100% 6.00 100.0000 100.00 100.0 100.0000
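
    To see the failure in isolation, here is a minimal sketch. The vector x is hypothetical (it is not the asker's data); it only imitates a column with many tied values, so that several quartiles coincide.

    ```r
    # Hypothetical vector with many ties (illustrative values only)
    x <- c(59.38, 100, 100, 100, 100, 100)

    # Default quartiles: 0%, 25%, 50%, 75%, 100%
    q <- quantile(x, na.rm = TRUE)

    # anyDuplicated() returns a non-zero index when some quantiles coincide
    has_ties <- anyDuplicated(q) > 0
    print(has_ties)

    # cut() rejects duplicated breaks; tryCatch() captures the error message
    msg <- tryCatch(
      {
        cut(x, breaks = c(-Inf, q, Inf))
        "no error"
      },
      error = function(e) conditionMessage(e)
    )
    print(msg)
    ```

    Checking anyDuplicated() on the quantiles before calling cut() is a cheap way to find out which columns will trigger the error.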
    

    One way to solve this is to wrap quantile() in unique(), which drops the duplicated quantile values. The trade-off is that you get fewer break points, and therefore fewer bins, whenever the quantiles are not unique.

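    # Each i is a column-name prefix, so paste(i, 1, sep = ".") gives e.g. "a.1"
    res <- lapply(dup.temp[, 1], function(i) {
      # Breaks come from the unique quantiles of column <i>.1
      breaks <- c(-Inf, unique(quantile(a[, paste(i, 1, sep = ".")], na.rm = TRUE)), Inf)
      # Column <i>.2 is then binned with those breaks
      cut(a[, paste(i, 2, sep = ".")], breaks)
    })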
    
    [[1]]
    [1]         (91.7,100]  (58.3,91.7]                 (91.7,100] 
    Levels: (-Inf,37.5] (37.5,42.5] (42.5,58.3] (58.3,91.7] (91.7,100] (100, Inf]
    
    [[2]]
    [1] (59.4,100]  (59.4,100]  (59.4,100]  (-Inf,59.4] (59.4,100]  (59.4,100] 
    Levels: (-Inf,59.4] (59.4,100] (100, Inf]
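
    Because the code above depends on a and dup.temp from the question, here is a self-contained sketch of the same unique() approach. The data frame df and its values are assumptions made up for illustration; only the column names mirror the question.

    ```r
    # Hypothetical data frame; values are illustrative, not the asker's data
    df <- data.frame(
      b.1 = c(59.38, 100, 100, 100, 100, 100),
      b.2 = c(88.89, 100, 100, 59.38, 100, 100)
    )

    # Deduplicate the quantiles of b.1 before using them as breaks
    breaks <- c(-Inf, unique(quantile(df$b.1, na.rm = TRUE)), Inf)

    # Bin b.2 with the deduplicated breaks; this no longer raises an error
    binned <- cut(df$b.2, breaks)

    print(breaks)
    print(table(binned))
    ```

    Note that once duplicates are dropped, the bins no longer correspond to quartiles: here five quantile values collapse to two, leaving three intervals instead of six.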
    
