R/ggplot Cumulative Sum in Histogram

北荒 2020-12-11 12:13

I have a dataset with user IDs and the number of objects they created. I drew the histogram using ggplot and now I'm trying to include the cumulative sum of the x-values as

1 answer
  • 2020-12-11 13:12

    Here is an illustrative example that could be helpful for you.

    set.seed(111)
    userID <- 1:100
    Num_Tours <- sample(1:100, 100, replace = TRUE)
    userStats <- data.frame(userID, Num_Tours)
    
    # Sort the rows by Num_Tours so each userID stays paired with its own value
    userStats <- userStats[order(userStats$Num_Tours), ]
    # Cumulative share of all tours (a fraction that rises to 1)
    userStats$cumulative <- cumsum(userStats$Num_Tours) / sum(userStats$Num_Tours)
    
    library(ggplot2)
    # Manually chosen maximum of the primary y-axis; the red line is rescaled to it
    ymax <- 40
    ggplot(data = userStats, aes(x = Num_Tours)) +
      geom_histogram(binwidth = 0.2, col = "white") +
      scale_x_log10(name = "Number of planned tours", breaks = c(1, 5, 10, 50, 100, 200)) +
      geom_line(aes(y = cumulative * ymax), col = "red", lwd = 1) +
      scale_y_continuous(
        name = "Number of users",
        # Map the count axis back to 0-100 so the secondary axis matches its [%] label
        sec.axis = sec_axis(~ . / ymax * 100, name = "Cumulative percentage of routes [%]")
      )
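
    The rescaling behind the secondary axis can be checked without drawing anything. The following is a minimal base-R sketch, not part of the original answer: `cumulative_share` is a hypothetical helper name, and `ymax <- 40` is simply the same manually chosen value used above. It shows that multiplying the cumulative share by `ymax` places the curve on the count axis, and that dividing by `ymax` and multiplying by 100 recovers a percentage.

```r
# Hypothetical helper (an assumption for illustration): sort the values and
# compute the running share of the total.
cumulative_share <- function(x) {
  x_sorted <- sort(x)
  data.frame(value = x_sorted, share = cumsum(x_sorted) / sum(x_sorted))
}

set.seed(111)
tours <- sample(1:100, 100, replace = TRUE)
cs <- cumulative_share(tours)

ymax <- 40                             # same manual maximum as in the answer
cs$scaled  <- cs$share * ymax          # what the line layer draws on the count axis
cs$percent <- cs$scaled / ymax * 100   # what the secondary axis would display

tail(cs, 3)                            # inspect the last rows of the curve
```

    If the histogram's tallest bar exceeds `ymax`, the curve still ends at `ymax`, so it may be worth choosing `ymax` close to the largest bin count.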
    
