How to enforce ggplot's position_dodge on categories with no data?

孤街浪徒 2020-12-30 01:15

I'm trying to use position_dodge in ggplot to obtain boxplots of two different signals (ind) sharing the same categories (cat). When there is a category with data for one

3 answers
  • 2020-12-30 01:43

    After some workarounds, I came up with the outcome I was looking for... (kind of)

    library(ggplot2)

    # The last row ('B', 0, 'x') is a placeholder so that x still gets a box at B
    data <- data.frame(
      cat    = c('A','A','A','A','B','B','A','A','A','A','B','B','B'),
      values = c(3, 2, 1, 4, NA, NA, 4, 5, 6, 7, 8, 9, 0),
      ind    = c('x','x','x','x','x','x','y','y','y','y','y','y','x'))

    p <- ggplot() +
      scale_colour_hue(guide = 'none') +
      geom_boxplot(aes(x = as.factor(cat), y = values, fill = ind),
                   position = position_dodge(width = .60),
                   data = data,
                   outlier.size = 1.2,
                   na.rm = TRUE) +
      # White line at y = 0 that hides the flat placeholder box
      geom_line(aes(x = x, y = y),
                data = data.frame(x = c(0, 3), y = rep(0, 2)),
                size = 1,
                col = 'white')
    print(p)
    

    [Figure: solution with workaround]

    Some people recommended using faceting for the effect I wanted. Faceting doesn't give me the effect I'm looking for. The final graph I was looking for is shown below:

    [Figure: final graph]

    If you notice, the white major tick mark at y = 10 is thicker than the other tick marks. This thicker line is the geom_line with size=1 that hides unwanted boxplots.

    I wish we could combine different geom objects more seamlessly. I reported this as a bug on Hadley's GitHub, but Hadley said that this is how position_dodge behaves by design. I guess I'm using ggplot2 in a non-standard way, and workarounds are the way to go for these kinds of issues. Anyway, I hope this helps some R folks push ggplot's great functionality a little further.
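
    For readers on a more recent ggplot2, here is a minimal sketch of an alternative that needs neither a placeholder row nor a masking line. It assumes a ggplot2 release in which position_dodge() accepts a `preserve` argument (3.0.0 or later, as far as I know); the name `p_single` is made up for this example. With preserve = "single", each box keeps the width of a single dodged element instead of expanding to fill the category. The box's horizontal placement within the category may still differ from the slot it occupies elsewhere, so check the result against your own data.

    ```r
    library(ggplot2)

    # Same example data as in the question, without any placeholder row
    data <- data.frame(
      cat    = c('A','A','A','A','B','B','A','A','A','A','B','B'),
      values = c(3, 2, 1, 4, NA, NA, 4, 5, 6, 7, 8, 9),
      ind    = c('x','x','x','x','x','x','y','y','y','y','y','y'))

    # Assumption: this ggplot2 version supports the `preserve` argument.
    # preserve = "single" keeps every box as wide as one dodged element.
    p_single <- ggplot(data, aes(x = factor(cat), y = values, fill = ind)) +
      geom_boxplot(position = position_dodge(width = .60, preserve = "single"),
                   na.rm = TRUE)
    ```

    Call print(p_single) to draw it. position_dodge2() takes the same `preserve` argument and may place the lone box differently.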

  • 2020-12-30 01:58

    I got the idea of using faceting from one of the comments Hadley posted on his git site, so credit goes to Hadley, the maintainer of the ggplot2 package!

    See if this is what you wanted. To learn more about the options for setting the whiskers and other elements of this plot, check this help page in the ggplot2 package:

    library(ggplot2)
    ?stat_boxplot

    data <- data.frame(
      cat    = c('A','A','A','A','B','B','A','A','A','A','B','B'),
      values = c(3, 2, 1, 4, NA, NA, 4, 5, 6, 7, 8, 9),
      ind    = c('x','x','x','x','x','x','y','y','y','y','y','y'))

    # One panel per signal (ind)
    p <- ggplot(data = data, aes(factor(cat), values))
    p + stat_boxplot(geom = "boxplot", position = "dodge", width = 0.60, na.rm = TRUE) +
      facet_grid(. ~ ind)
    


    To add colors to your plot (redundant in my opinion, since the plot is already faceted by the "ind" variable), try this:

    p <- ggplot(data, aes(factor(cat), values, fill = ind))                     
    p + stat_boxplot(geom="boxplot", position = "dodge", width = 0.60, na.rm = TRUE) + facet_grid(.~ind)
    


    HTH!

  • 2020-12-30 02:06

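    Signal x has no values in category B, so you can add a placeholder row ("B", 0, "x"), which essentially says that there is no distribution of "values" for x in B. With a single value of 0, the median and all other percentiles are zero, so the box collapses to a flat line at 0.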

    data <- data.frame(
      cat    = c('A','A','A','A','B','B','A','A','A','A','B','B','B'),
      values = c(3, 2, 1, 4, NA, NA, 4, 5, 6, 7, 8, 9, 0),
      ind    = c('x','x','x','x','x','x','y','y','y','y','y','y','x'))
    

    You also do not have to set the position parameter here: when x is treated as a factor, geom_boxplot automatically dodges the groups defined by fill to the sides.

    library(ggplot2)

    print(ggplot() +
      scale_colour_hue(guide = 'none') +
      geom_boxplot(aes(x = as.factor(cat), y = values, fill = ind),
                   data = data,
                   outlier.size = 1.2,
                   na.rm = TRUE))
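
    If there are many categories, adding placeholder rows by hand gets tedious. Below is a rough sketch, in base R only, of how the empty cat/ind combinations could be found and padded automatically. The names `ok`, `counts`, `empty`, `placeholders` and `data_padded` are made up for this example, and the placeholder value 0 simply mirrors the row added by hand above; pick whatever value suits your plot.

    ```r
    # Data from the question, before any placeholder row is added
    data <- data.frame(
      cat    = c('A','A','A','A','B','B','A','A','A','A','B','B'),
      values = c(3, 2, 1, 4, NA, NA, 4, 5, 6, 7, 8, 9),
      ind    = c('x','x','x','x','x','x','y','y','y','y','y','y'))

    # Count the non-missing values in every cat/ind combination.
    # Fixing the factor levels first keeps combinations with zero rows.
    ok <- !is.na(data$values)
    counts <- as.data.frame(
      table(cat = factor(data$cat, levels = unique(data$cat))[ok],
            ind = factor(data$ind, levels = unique(data$ind))[ok]),
      stringsAsFactors = FALSE)

    # Combinations without data would get no box, so pad each with one row
    empty <- counts[counts$Freq == 0, c("cat", "ind")]
    placeholders <- data.frame(cat    = empty$cat,
                               values = rep(0, nrow(empty)),  # assumed placeholder value
                               ind    = empty$ind)
    data_padded <- rbind(data, placeholders)
    ```

    data_padded can then be passed to the geom_boxplot call above in place of data.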
    
