Getting frequency values from histogram in R

误落风尘 2020-12-13 00:18

I know how to draw histograms and other frequency or percentage tables. What I want now is to get the frequency values themselves out as a table, so that I can use them afterwards.

3 Answers
  •  爱一瞬间的悲伤
    2020-12-13 00:49

    Just in case someone hits this question with ggplot's geom_histogram in mind, note that there is a way to extract the data from a ggplot object.

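    The following convenience function outputs a data frame with the lower limit of each bin (xmin), the upper limit of each bin (xmax), the mid-point of each bin (x), and the value mapped to the y aesthetic (y). In the example below y is the cumulative count, because that is what the plot maps to y; the plain per-bin frequency is available as the count column of the same built data.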

    ## Convenience function
    library(ggplot2)
    
    # Return one row per bin from the first layer of a built ggplot object
    get_hist <- function(p) {
        d <- ggplot_build(p)$data[[1]]
        data.frame(x = d$x, xmin = d$xmin, xmax = d$xmax, y = d$y)
    }
    
    # make a dataframe for ggplot
    set.seed(1)
    x <- runif(100, 0, 10)
    y <- cumsum(x)
    df <- data.frame(x = sort(x), y = y)
    
    # make geom_histogram; y is mapped to the cumulative count
    # (after_stat() needs ggplot2 >= 3.3.0; older versions use ..count..)
    p <- ggplot(data = df, aes(x = x)) + 
        geom_histogram(aes(y = after_stat(cumsum(count))), binwidth = 1, boundary = 0,
                    color = "black", fill = "white")
    

    Illustration:

    # named hist_df so that it does not mask base R's hist()
    hist_df <- get_hist(p)
    head(hist_df$x)
    ## [1] 0.5 1.5 2.5 3.5 4.5 5.5
    head(hist_df$y)
    ## [1]  7 13 24 38 52 57
    head(hist_df$xmax)
    ## [1] 1 2 3 4 5 6
    head(hist_df$xmin)
    ## [1] 0 1 2 3 4 5
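
    For a histogram drawn with base R rather than ggplot, the same kind of table can be built from the object that hist() returns. The following is only a minimal sketch: the data are the same simulated values as above, the breaks 0:10 are chosen to mirror binwidth = 1 and boundary = 0, and the names h and freq are made up for illustration.

    ```r
    # Compute the histogram without drawing it
    set.seed(1)
    x <- runif(100, 0, 10)
    h <- hist(x, breaks = 0:10, plot = FALSE)

    # One row per bin: limits, mid-point and per-bin count
    freq <- data.frame(
        xmin  = head(h$breaks, -1),  # all breaks except the last
        xmax  = tail(h$breaks, -1),  # all breaks except the first
        x     = h$mids,
        count = h$counts
    )

    # Running total, comparable to the cumulative y of the ggplot example
    freq$cumulative <- cumsum(freq$count)
    ```

    Dropping plot = FALSE draws the histogram and still returns the same object, so the counts can be kept from a plot that is shown anyway.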
    

    I also answered a related question: Cumulative histogram with ggplot2.
