Control column widths in a ggplot2 graph with a series and inconsistent data

邮差的信 提交于 2019-12-28 04:27:27

问题


In the artificial data I have created for the MWE below I have tried to demonstrate the essence of a script I have created in R. As can be seen by the graph that gets produced from this code, on one of my conditions I don't have a "No" value to complete the series.

I have been told that unless I can make this last column that sadly doesn't have the extra series as thin as the columns else where in the graph I won't be permitted to use these graphs. This is sadly a problem because the script I have written produces hundreds of graphs simultaneously, complete with stats, significance indicators, propogated error bars, and intelligent y-axis adjustments (these features are of course not present in the MWE).

Few other comments:

  • This exception column is not guaranteed to be at the end of the graph... so manual tweaking to force the series to change color and invert the order leaving the extra space on the right hand side isn't reliable.

  • I have tried to simulate the data as a constant 0 so that the series "is present" but invisible, but as would be expected, the order of the series c(No,Yes) makes this skip a space which is also unacceptable. This is how this same question was answered here, but sadly it doesn't work for me with my restrictions: Consistent width for geom_bar in the event of missing data and Include space for missing factor level used in fill aesthetics in geom_boxplot

  • I also tried to do this with facets but numerous issues arose there including line breaks, and errors in the annotations I add to the x-axis.

MWE:

library(ggplot2)

print("Program started")

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
df <- as.data.frame(cbind(x,s,y))

print(df)

gg <- ggplot(data = df, aes_string(x="x", y="y", weight="y", ymin=paste0("y"), ymax=paste0("y"), fill="s"));
dodge_str <- position_dodge(width = NULL, height = NULL);
gg <- gg + geom_bar(position=dodge_str, stat="identity", size=.3, colour = "black")

print(gg)

print("Program complete - a graph should be visible.")

回答1:


At the expense of doing your own calculation for the x coordinates of the bars as shown below, you can get a chart which may be close to what you're looking for.

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
df <- data.frame(cbind(x,s,y) )
df$x_pos[order(df$x, df$s)] <- 1:nrow(df)
x_stats <- as.data.frame.table(table(df$x), responseName="x_counts")
x_stats$center <- tapply(df$x_pos, df$x, mean)
df <-  merge(df, x_stats, by.x="x", by.y="Var1", all=TRUE)
bar_width <- .7
df$pos <- apply(df, 1, function(x) {xpos=as.numeric(x[4]) 
                                if(x[5] == 1) xpos 
                                else ifelse(x[2]=="No", xpos + .5 -        bar_width/2, xpos - .5 + bar_width/2) } )
 print(df)
gg <- ggplot(data=df, aes(x=pos, y=y, fill=s ) )
gg <- gg + geom_bar(position="identity", stat="identity", size=.3,    colour="black", width=bar_width)
gg <- gg + scale_x_continuous(breaks=df$center,labels=df$x )
plot(gg)

----- edit --------------------------------------------------

Modified to place the labels at the center of bars.

Gives the following chart




回答2:


Yeah, I figured what happened: you need to be extra careful about factors being factors and numerics being numerics. In my case, with stringsAsFactors = FALSE I have

str(df)
'data.frame':   7 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "1" ...
 $ s: chr  "No" "No" "No" "Yes" ...
 $ y: chr  "1" "2" "3" "2" ...

dput(df)
structure(list(x = c("1", "2", "3", "1", "2", "3", "4"), s = c("No", 
"No", "No", "Yes", "Yes", "Yes", "Yes"), y = c("1", "2", "3", 
"2", "3", "4", "5")), .Names = c("x", "s", "y"), row.names = c(NA, 
-7L), class = "data.frame")

with no factors and numeric turned into character because of cbind-ing (sic!). Let us have another data frame:

dff <- data.frame(x = factor(df$x), s = factor(df$s), y = as.numeric(df$y))

Adding a "dummy" row (manually for your example, check out expand.grid version in the linked question on how to do this automatically):

dff <- rbind(dff, c(4, "No", NA))

Plotting (I removed extra aes):

ggplot(data = df3, aes(x, y, fill=s)) + 
  geom_bar(position=dodge_str, stat="identity", size=.3, colour="black")



来源:https://stackoverflow.com/questions/28890227/control-column-widths-in-a-ggplot2-graph-with-a-series-and-inconsistent-data

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!