How to replace outliers with the 5th and 95th percentile values in R

野性不改 2020-12-30 11:26

I'd like to replace all values in my relatively large R dataset that fall above the 95th percentile or below the 5th percentile with those percentile values.

4 answers
  •  爱一瞬间的悲伤
    2020-12-30 12:05

    You can do it in one line of code using squish():

    d2 <- squish(d, quantile(d, c(.05, .95)))
    



    In the scales package, see ?squish and ?discard: squish() clamps out-of-range values to the nearest limit, while discard() drops them.

    #--------------------------------
    library(scales)
    
    pr <- .95
    q  <- quantile(d, c(1-pr, pr))
    d2 <- squish(d, q)
    #---------------------------------
    
    # Note: depending on your needs, you may want to round off the quantiles, i.e.:
    q <- round(quantile(d, c(1-pr, pr)))
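
Since the question concerns a whole dataset rather than a single vector, the recipe above could be applied column by column. The following is only a sketch under assumptions: `clamp_to_quantiles` is a hypothetical helper name, `df` is made-up example data standing in for the real dataset, and `na.rm = TRUE` is added on the assumption that some columns may contain missing values.

```r
library(scales)

# Hypothetical helper: clamp a numeric vector to its own lower/upper quantiles.
clamp_to_quantiles <- function(x, pr = 0.95) {
  # na.rm = TRUE so that missing values do not make quantile() fail
  q <- quantile(x, c(1 - pr, pr), na.rm = TRUE)
  # squish() only modifies finite values by default, so NAs stay NA
  squish(x, q)
}

# Made-up example data; `df` stands in for the real dataset.
set.seed(1)
df <- data.frame(
  id = letters[1:10],
  a  = c(rnorm(9), 50),
  b  = c(-100, runif(8), NA)
)

# Apply the helper to the numeric columns only; other columns are untouched.
num_cols <- vapply(df, is.numeric, logical(1))
df[num_cols] <- lapply(df[num_cols], clamp_to_quantiles)
```

Each column is clamped to its own percentiles here; if a single pair of cutoffs should apply to all columns, compute `q` once and pass it to `squish()` directly.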
    

    Example:

    d <- 1:20
    d
    # [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    
    
    d2 <- squish(d, round(quantile(d, c(.05, .95))))
    d2
    # [1]  2  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 19
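
If adding a package dependency is not desirable, the same clamping can probably be done in base R with `pmax()` and `pmin()`. This is a sketch of an alternative technique, not what the answer itself uses; it reuses the example vector `d` and the rounded quantiles from above.

```r
d <- 1:20

# Lower and upper cutoffs (rounded, as in the example above)
q <- unname(round(quantile(d, c(0.05, 0.95))))

# Raise values below q[1] up to q[1], then lower values above q[2] down to q[2]
d2 <- pmin(pmax(d, q[1]), q[2])
d2
# [1]  2  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 19
```

Unlike `squish()`, which leaves non-finite values alone by default, this version would also clamp `Inf` and `-Inf` to the cutoffs.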
    
