ggplot using second data source for error bars fails

我与影子孤独终老i 提交于 2020-04-11 05:23:22

问题


This is a follow-on to a previous question about getting some custom error bars.

  1. The look of the plot is how I need it, so don't worry about commenting in solely in regards to that (happy to hear opinions attached to other help though)
  2. Because these plots are generated in a loop, and the error bars are actually only added if a condition is met, I cant simply merge all the data up front, so assume for the purpose of this exercise the plot data and errorbar data are from different dfs.

I have a ggplot, to which I attempt to add some error bars using a different dataframe. When I call the plot, it says that it cannot find the y values from the parent plot, even though I'm just trying to add error bars using new data. I know this has to be a syntax error but I am stumped...

First lets generate data and the plot

library(ggplot2)
library(scales)

# some data
data.2015 = data.frame(score = c(-50,20,15,-40,-10,60),
                       area = c("first","second","third","first","second","third"),
                       group = c("Findings","Findings","Findings","Benchmark","Benchmark","Benchmark"))

data.2014 = data.frame(score = c(-30,40,-15),
                       area = c("first","second","third"),
                       group = c("Findings","Findings","Findings"))

# breaks and limits
breaks.major = c(-60,-40,-22.5,-10, 0,10, 22.5, 40, 60)
breaks.minor = c(-50,-30,-15,-5,0, 5, 15,30,50) 
limits =c(-70,70)

# plot 2015 data
ggplot(data.2015, aes(x = area, y = score, fill = group)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  coord_flip() +
  scale_y_continuous(limit = limits, oob = squish, minor_breaks = breaks.minor, 
                     breaks = breaks.major)

Calling the plot (c) produces a nice plot as expected, now lets set up the error bars and attempt to add them as a new layer in the plot "c"

# get the error bar values
alldat = merge(data.2015, data.2014, all = TRUE, by = c("area", "group"), 
               suffixes = c(".2015", ".2014"))
alldat$plotscore = with(alldat, ifelse(is.na(score.2014), NA, score.2015))
alldat$direction = with(alldat, ifelse(score.2015 < score.2014, "dec", "inc"))
alldat$direction[is.na(alldat$score.2014)] = "absent"

#add error bars to original plot
c <- c+
  geom_errorbar(data=alldat, aes(ymin = plotscore, ymax = score.2014, color = direction), 
                position = position_dodge(width = .9), lwd = 1.5, show.legend = FALSE)

When I call c now, I get

"Error in eval(expr, envir, enclos) : object 'score' not found"

Why does it look for data.2015$score when I just want it to overlay the geom_errorbar using the second alldat dataframe?

EDIT* I've tried to specify the ymin/ymax values for the error bars using alldata$plotscore and alldat$score.2014 (which I am sure is bad practice), it plots, but the bars are in the wrong positions/out of order with the plot (e.g. swapped around, on the benchmark bars instead, etc.)


回答1:


In my experience, this error about some variable not being found tells me that R went to look in a data.frame for a variable and it wasn't there. Sometimes the solution is as simple as fixing a typo, but in your case the score variable isn't in the dataset you used to make your error bars.

names(alldat)
[1] "area"       "group"      "score.2015" "score.2014" "plotscore"  "direction"

The y variable is a required aesthetic for geom_errorbar. Because you set a y variable globally within ggplot, the other geoms inherit the global y unless you specifically map it to a different variable. In the current dataset, you'll need map y to the 2015 score variable.

geom_errorbar(data=alldat, aes(y = score.2015, ymin = plotscore, 
                               ymax = score.2014, color = direction), 
              position = position_dodge(width = .9), lwd = 1.5, show.legend = FALSE)

In your comment you indicated you also had to add fill to geom_errobar, as well, but I didn't find that necessary when I ran the code (you can see above that group is a variable in the second dataset in the example you give).

The other option would be to make sure the 2015 score variable is still named score after merging. This can be done by changing the suffixes argument in in merge. Then score will be in the second dataset and you won't have to set your y variable in geom_errorbar.

alldat2 = merge(data.2015, data.2014, all = TRUE, by = c("area", "group"), 
            suffixes = c("", ".2014"))
...
names(alldat2)
[1] "area"       "group"      "score"      "score.2014" "plotscore"  "direction" 


来源:https://stackoverflow.com/questions/32108269/ggplot-using-second-data-source-for-error-bars-fails

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!