Plotting cumulative counts in ggplot2

爱一瞬间的悲伤 2020-12-23 20:58

There are some posts about plotting cumulative densities in ggplot. I'm currently using the accepted answer from Easier way to plot the cumulative frequency distribution in

2 Answers
  • 2020-12-23 21:12

    You can apply row_number() within each group and use the result as the y aesthetic in geom_step() or another geom. You just have to sort by X first; otherwise the rows are numbered in the order they appear in the data frame, which is unordered.

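    library(dplyr)
    library(ggplot2)

    # Sort by X, then number the rows within each level of A:
    # rn is the cumulative count of observations up to that X.
    ggplot(x %>% 
             group_by(A) %>% 
             arrange(X) %>% 
             mutate(rn = row_number())) + 
      geom_step(aes(x=X, y=rn, color=A))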
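
    If dplyr is not available, the same running count can be sketched in base R. This is only an illustration: it assumes a data frame with the columns A and X used above, and the sample data and the name x_sorted are made up for the example.

    ```r
    # Hypothetical sample data with a grouping column A and a numeric column X.
    set.seed(123)
    x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                    X = rnorm(200))

    # Sort by X so the running count follows the x axis.
    x_sorted <- x[order(x$X), ]

    # Within each level of A, number the rows 1, 2, ..., n in X order.
    x_sorted$rn <- ave(x_sorted$X, x_sorted$A, FUN = seq_along)

    # Build the plot only if ggplot2 is installed; print(p) draws it.
    if (requireNamespace("ggplot2", quietly = TRUE)) {
      library(ggplot2)
      p <- ggplot(x_sorted, aes(x = X, y = rn, color = A)) + geom_step()
    }
    ```

    The last value of rn in each group equals the number of observations in that group, so every line ends at its group's total count.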
    

  • 2020-12-23 21:13

    This does not directly solve the problem with grouping of the lines, but it works as a workaround.

    You can add three calls to stat_bin(), each on a subset of the data for one level of A, so that cumsum() only ever sees the counts of a single group.

    ggplot(x,aes(x=X,color=A)) +
      stat_bin(data=subset(x,A=="a"),aes(y=cumsum(..count..)),geom="step")+
      stat_bin(data=subset(x,A=="b"),aes(y=cumsum(..count..)),geom="step")+
      stat_bin(data=subset(x,A=="c"),aes(y=cumsum(..count..)),geom="step")
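
    For comparison, here is a rough sketch of the same per-group cumulative binning computed by hand in base R. The 30 bins, the sample data and the names breaks, counts and cum_counts are assumptions made for the example. The optional ggplot2 part assumes ggplot2 3.3.0 or later, where after_stat() is available, and uses ave() over the computed group column instead of one layer per level.

    ```r
    # Hypothetical sample data with a grouping column A and a numeric column X.
    set.seed(123)
    x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                    X = rnorm(200))

    # Shared bin edges so every group is counted on the same grid
    # (30 bins is an arbitrary choice).
    breaks <- seq(floor(min(x$X)), ceiling(max(x$X)), length.out = 31)
    bins <- cut(x$X, breaks = breaks, include.lowest = TRUE)

    # Rows are bins, columns are levels of A; cumsum runs down each column.
    counts <- table(bins, x$A)
    cum_counts <- apply(counts, 2, cumsum)

    # Optional single-layer plot: cumulate the binned counts within each group.
    if (requireNamespace("ggplot2", quietly = TRUE)) {
      library(ggplot2)
      p <- ggplot(x, aes(x = X, color = A)) +
        stat_bin(aes(y = after_stat(ave(count, group, FUN = cumsum))),
                 breaks = breaks, geom = "step")
    }
    ```

    The last row of cum_counts holds the total number of observations per level of A, which is where each step line should end.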
    


    UPDATE - solution using geom_step()

    Another possibility is to multiply the values of ..y.. (the ECDF value, between 0 and 1) by the number of observations in each level. The only way I have found so far to get that number is to precalculate it before plotting and add it to the original data frame; I named this column len. Then, in geom_step(), pass it inside aes() as len=len and define the y values as y=..y.. * len.

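    library(plyr)
    library(ggplot2)

    set.seed(123)
    x <- data.frame(A=replicate(200,sample(c("a","b","c"),1)),X=rnorm(200))
    # Add len: the number of observations in each level of A.
    df <- ddply(x,.(A),transform,len=length(X))
    # ..y.. is the ECDF value (0 to 1); multiplying by len turns it into a count.
    ggplot(df,aes(x=X,color=A)) + geom_step(aes(len=len,y=..y.. * len),stat="ecdf")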
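
    The scaling relies on the identity count(x) = n * F(x), where F is the empirical CDF of a group and n is its size. A small base R check of that identity is sketched below. The sample data and the column name cum are made up for the example, and the comparison with rank() assumes there are no tied X values.

    ```r
    # Hypothetical sample data with a grouping column A and a numeric column X.
    set.seed(123)
    x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                    X = rnorm(200))

    # len: the number of observations in each row's group.
    x$len <- ave(x$X, x$A, FUN = length)

    # ECDF evaluated within each group, scaled back to a count.
    x$cum <- ave(x$X, x$A, FUN = function(v) ecdf(v)(v)) * x$len

    # Without ties, the scaled ECDF is the within-group rank of each X.
    same_as_rank <- isTRUE(all.equal(x$cum, ave(x$X, x$A, FUN = rank)))
    ```

    The precomputed cum column can also be plotted directly, for example with geom_step(aes(y = cum)) after sorting the rows by X.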
    

