Tukeys post-hoc on ggplot boxplot

心已入冬 提交于 2019-12-17 20:01:49

问题


Ok, so I think I'm pretty close with this, but I'm getting an error when I try to construct my box plot at the end. My goal is to place letters denoting statistical relationships among the time points above each boxplot. I've seen two discussion of this on this site, and can reproduce the results from their code, but can't apply it to my dataset.

Packages

library(ggplot2)
library(multcompView)
library(plyr)

Here is my data:

dput(WaterConDryMass)
structure(list(ChillTime = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), .Label = c("Pre_chill", 
"6", "13", "24", "Post_chill"), class = "factor"), dmass = c(0.22, 
0.19, 0.34, 0.12, 0.23, 0.33, 0.38, 0.15, 0.31, 0.34, 0.45, 0.48, 
0.59, 0.54, 0.73, 0.69, 0.53, 0.57, 0.39, 0.8)), .Names = c("ChillTime", 
"dmass"), row.names = c(NA, -20L), class = "data.frame")

ANOVA and Tukey Post-hoc

Model4 <- aov(dmass~ChillTime, data=WaterConDryMass)
tHSD <- TukeyHSD(Model4, ordered = FALSE, conf.level = 0.95)
plot(tHSD , las=1 , col="brown" )

Function:

generate_label_df <- function(TUKEY, flev){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- TUKEY[[flev]][,4]
  Tukey.labels <- multcompLetters(Tukey.levels)['Letters']
  plot.labels <- names(Tukey.labels[['Letters']])

  boxplot.df <- ddply(WaterConDryMass, flev, function (x) max(fivenum(x$y)) + 0.2)

  # Create a data frame out of the factor levels and Tukey's homogenous group letters
  plot.levels <- data.frame(plot.labels, labels = Tukey.labels[['Letters']],
                            stringsAsFactors = FALSE) 

  # Merge it with the labels
  labels.df <- merge(plot.levels, boxplot.df, by.x = 'plot.labels', by.y = flev, sort = FALSE)
  return(labels.df)
}  

Boxplot:

ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_blank() +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
  ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
  theme(plot.title = element_text(hjust = 0.5, face='bold')) +
  annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
  geom_boxplot(fill = 'green2', stat = "boxplot") +
  geom_text(data = generate_label_df(tHSD), aes(x = plot.labels, y = V1, label = labels)) +
  geom_vline(aes(xintercept=4.5), linetype="dashed") +
  theme(plot.title = element_text(vjust=-0.6))

Error:

Error in HSD[[flev]] : invalid subscript type 'symbol'

回答1:


I think I found the tutorial you are following, or something very similar. You would probably be best to copy and paste this whole thing into your work space, function and all, to avoid missing a few small differences.

Basically I have followed the tutorial (http://www.r-graph-gallery.com/84-tukey-test/) to the letter and added a few necessary tweaks at the end. It adds a few extra lines of code, but it works.

generate_label_df <- function(TUKEY, variable){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- TUKEY[[variable]][,4]
  Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])

  #I need to put the labels in the same order as in the boxplot :
  Tukey.labels$treatment=rownames(Tukey.labels)
  Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
  return(Tukey.labels)
}

model=lm(WaterConDryMass$dmass~WaterConDryMass$ChillTime )
ANOVA=aov(model)

# Tukey test to study each pair of treatment :
TUKEY <- TukeyHSD(x=ANOVA, 'WaterConDryMass$ChillTime', conf.level=0.95)

labels<-generate_label_df(TUKEY , "WaterConDryMass$ChillTime")#generate labels using function

names(labels)<-c('Letters','ChillTime')#rename columns for merging

yvalue<-aggregate(.~ChillTime, data=WaterConDryMass, mean)# obtain letter position for y axis using means

final<-merge(labels,yvalue) #merge dataframes

ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_blank() +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
  ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
  theme(plot.title = element_text(hjust = 0.5, face='bold')) +
  annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
  geom_boxplot(fill = 'green2', stat = "boxplot") +
  geom_text(data = final, aes(x = ChillTime, y = dmass, label = Letters),vjust=-3.5,hjust=-.5) +
  geom_vline(aes(xintercept=4.5), linetype="dashed") +
  theme(plot.title = element_text(vjust=-0.6))



来源:https://stackoverflow.com/questions/44712185/tukeys-post-hoc-on-ggplot-boxplot

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!