Divide a range of values into bins of equal length: cut vs cut2

时光说笑 2020-12-31 07:36

I'm using the cut function to split my data into equal bins. It does the job, but I'm not happy with the way it returns the values. What I need is the center of the bin not t

3 answers
  •  天涯浪人
    2020-12-31 07:57

    It's not too hard to make the breaks and labels yourself, with something like this. Since each midpoint is a single number, I return a numeric vector rather than a factor with labels.

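    cut2 <- function(x, breaks) {
      r <- range(x)
      # 2*breaks+1 equally spaced points: bin edges alternate with bin centers
      b <- seq(r[1], r[2], length.out = 2*breaks + 1)
      brk <- b[0:breaks*2 + 1]   # odd positions: the breaks+1 bin edges
      mid <- b[1:breaks*2]       # even positions: the bin centers
      # include.lowest = TRUE keeps min(x) in the first bin
      k <- cut(x, breaks = brk, labels = FALSE, include.lowest = TRUE)
      mid[k]                     # numeric vector: the bin center for each element of x
    }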
    

    There's probably a better way to get the bin breaks and midpoints; I didn't think about it very hard.
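
    One possibly simpler variant, as a sketch only: build just the bin edges and take each center as the mean of two adjacent edges. The name cut_mid and its argument n_bins are made up for illustration and are not part of base R. It assumes x has at least two distinct values, since cut() rejects duplicated breaks.

```r
# cut_mid is a hypothetical helper name, not a base R function.
cut_mid <- function(x, n_bins) {
  r <- range(x)
  # n_bins + 1 equally spaced edges spanning the data range
  brk <- seq(r[1], r[2], length.out = n_bins + 1)
  # center of each bin = mean of its left and right edge
  mid <- (brk[-1] + brk[-length(brk)]) / 2
  # include.lowest = TRUE so that min(x) falls in the first bin
  k <- cut(x, breaks = brk, labels = FALSE, include.lowest = TRUE)
  mid[k]
}

# Edges are 0, 6.67, 13.33, 20, so the centers are about 3.33, 10, 16.67
res <- cut_mid(c(0, 2, 18, 20), 3)
print(res)
```

    For the four values above, the first two map to the center of the first bin (about 3.33) and the last two to the center of the third bin (about 16.67).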

    Note that this answer is different from Joshua's: his gives the median of the data in each bin, while this gives the center of each bin.

    > head(cut2(x,3))
    [1] 16.666667  3.333333 16.666667  3.333333 16.666667 16.666667
    > head(ave(x, cut(x,3), FUN=median))
    [1] 18  2 18  2 18 18
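
    To see the two quantities side by side on data that is fully visible, here is a small sketch. The vector x below is made up for illustration and is not the asker's data.

```r
# Hypothetical data, chosen so that bin centers and bin medians differ
x <- c(0, 1, 2, 3, 17, 18, 19, 20)

# Three equal-width bins over the range of x: edges 0, 6.67, 13.33, 20
brk <- seq(min(x), max(x), length.out = 4)
bin <- cut(x, breaks = brk, labels = FALSE, include.lowest = TRUE)

# Center of the bin each value falls in (depends only on the edges)
centers <- ((brk[-1] + brk[-length(brk)]) / 2)[bin]

# Median of the data within each bin (depends on the values themselves)
medians <- ave(x, bin, FUN = median)

print(data.frame(x, bin, centers, medians))
```

    Here the first four values get a center of about 3.33 but a median of 1.5, and the last four get a center of about 16.67 but a median of 18.5. The middle bin is empty, so it appears in neither column.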
    
