Plotting cumulative counts in ggplot2

爱一瞬间的悲伤 2020-12-23 20:58

There are some posts about plotting cumulative densities in ggplot. I'm currently using the accepted answer from Easier way to plot the cumulative frequency distribution in

2 Answers
  •  庸人自扰
    2020-12-23 21:13

    This does not directly solve the problem of grouping the lines, but it works as a workaround.

    You can add three calls to stat_bin(), each on the subset of your data for one level of A.

    ggplot(x,aes(x=X,color=A)) +
      stat_bin(data=subset(x,A=="a"),aes(y=cumsum(..count..)),geom="step")+
      stat_bin(data=subset(x,A=="b"),aes(y=cumsum(..count..)),geom="step")+
      stat_bin(data=subset(x,A=="c"),aes(y=cumsum(..count..)),geom="step")
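
    The same workaround can be written without repeating the call for each level. This is only a sketch: it assumes ggplot2 3.3.0 or later (for after_stat(), which replaces the ..count.. notation), builds its own example data frame x, and the name layers is just illustrative.

```r
library(ggplot2)

# Example data (an assumption for this sketch): a grouping column A
# and a numeric column X.
set.seed(123)
x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                X = rnorm(200))

# One stat_bin() layer per level of A. Each layer sees a single group,
# so cumsum() runs over that level's bin counts only.
layers <- lapply(sort(unique(x$A)), function(lv) {
  stat_bin(data = subset(x, A == lv),
           aes(y = cumsum(after_stat(count))),
           geom = "step", bins = 30)
})

# Adding a list of layers to a plot adds each layer in turn.
p <- ggplot(x, aes(x = X, color = A)) + layers
print(p)
```

    Note that each layer bins its own subset, so the bin edges can differ between levels; passing a common breaks vector to stat_bin() would align them.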
    


    UPDATE - solution using geom_step()

    Another possibility is to multiply the values of ..y.. by the number of observations in each level. So far, the only way I have found to get these counts is to precalculate them before plotting and add them to the original data frame; I named this column len. Then, in geom_step(), pass len=len inside aes() and define the y values as y=..y.. * len.

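    library(ggplot2)
    library(plyr)

    set.seed(123)
    x <- data.frame(A=replicate(200,sample(c("a","b","c"),1)),X=rnorm(200))
    # Add len: the number of observations in each level of A
    df <- ddply(x,.(A),transform,len=length(X))
    # The ECDF is a proportion between 0 and 1; scaling it by len gives cumulative counts
    ggplot(df,aes(x=X,color=A)) + geom_step(aes(len=len,y=..y.. * len),stat="ecdf")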
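
    If passing the extra len aesthetic or the ..y.. notation causes trouble in your ggplot2 version, the cumulative counts can also be computed entirely before plotting. The following is a minimal sketch of that alternative, not the method described above; the column name cum_count is made up for illustration.

```r
library(ggplot2)

# Example data (an assumption for this sketch), same shape as above.
set.seed(123)
x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                X = rnorm(200))

# Sort by X so that the running index within each group
# is the cumulative count of observations up to that X value.
x <- x[order(x$X), ]

# ave() applies seq_along() within each level of A: 1, 2, ..., n_A.
x$cum_count <- ave(x$X, x$A, FUN = seq_along)

# The y values are now ordinary data, so no computed statistic is needed.
p <- ggplot(x, aes(x = X, y = cum_count, color = A)) +
  geom_step()
print(p)
```

    Because the counts live in the data frame, each line ends at the number of observations in its level, and no stat-specific syntax is involved.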
    

