How to fill geom_polygon with different colors above and below y = 0?

对着背影说爱祢 提交于 2019-11-26 20:56:11

Here's a possibility adapted from @kohske's answer here. All credits to him. Additional data points are generated by linear interpolation, and plot is made by geom_area.

First, a smaller example to make it easier to get a feeling for the linear interpolation and which points are added to the original data:

# original data
d <- data.frame(x = c(1:6),
                y = c(-1, 2, 1, 2, -1, 1))

# add a grouping variable just to keep track of original and interpolated points
d$grp <- "orig"

# create interpolated points
d <- d[order(d$x),]

new_d <- do.call("rbind",
              sapply(1:(nrow(d) -1), function(i){
                f <- lm(x ~ y, d[i:(i+1), ])
                if (f$qr$rank < 2) return(NULL)
                r <- predict(f, newdata = data.frame(y = 0))
                if(d[i, ]$x < r & r < d[i+1, ]$x)
                  return(data.frame(x = r, y = 0))
                else return(NULL)
              })
)

new_d$grp <- "new"

# combine original and interpolated data
d2 <- rbind(d, new_d)
d2   
#           x  y  grp
# 1  1.000000 -1 orig
# 2  2.000000  2 orig
# 3  3.000000  1 orig
# 4  4.000000  2 orig
# 5  5.000000 -1 orig
# 6  6.000000  1 orig
# 13 1.333333  0  new
# 11 4.666667  0  new
# 12 5.500000  0  new

# similar plot as below, but points are added, with different color (original vs new)
ggplot(data = d2, aes(x = x, y = y)) +
  geom_area(data = subset(d2, y <= 0), fill = "red", alpha = 0.2) +
  geom_area(data = subset(d2, y >= 0), fill = "blue", alpha = 0.2) +
  geom_point(aes(color = grp), size = 10) +
  theme_bw()

Your data:

orig <- orig[order(orig$year), ]

rx <- do.call("rbind",
              sapply(1:(nrow(orig) - 1), function(i){
                f <- lm(year ~ afw, orig[i:(i+1), ])
                if (f$qr$rank < 2) return(NULL)
                r <- predict(f, newdata = data.frame(afw = 0))
                if(orig[i, ]$year < r & r < orig[i + 1, ]$year)
                  return(data.frame(year = r, afw = 0))
                else return(NULL)
              })
)
d2 <- rbind(orig, rx)

ggplot(d2, aes(x = year, y = afw)) +
  geom_area(data = subset(d2, afw <= 0), fill = "red") +
  geom_area(data = subset(d2, afw >= 0), fill = "blue") +
  scale_x_continuous("", expand = c(0,0), breaks = seq(1910, 2010, 10)) +
  theme_bw()

So this is not perfect and I'm interested to see what others come up with...

The reason for the "multiple" colored areas is that a single polygon is bounded by the data points and the data points are not actually zero.

To solve this, we can interpolate using approx(). For a perfect solution, you would need to determine exactly where the line crosses zero.

interp <- approx(orig$year, orig$afw, n=10000)

orig2 <- data.frame(year=interp$x, afw=interp$y)
orig2$col[orig2$afw >= 0] <- "pos"
orig2$col[orig2$afw < 0] <- "neg"

ggplot(orig2, aes(x=year, y=afw)) +
  geom_area(aes(fill=col)) +
  geom_line() +
  geom_hline(yintercept=0)

However, you will see this still has issues when you zoom:


To elaborate on my statement above (and further illustrate the original "problem/issue"), consider what happens when you plot each of the original positive and negative datasets separately:

p1 <- ggplot(subset(orig, col == "neg"), aes(x = year, y = afw)) +
  geom_area(aes(fill=col)) +
  scale_fill_manual(values = c("#FF3030", "#00CC66"))

p2 <- ggplot(subset(orig, col == "pos"), aes(x = year, y = afw)) +
  geom_area(aes(fill=col)) +
  scale_fill_manual(values = c("#00CC66", "#FF3030"))

library(gridExtra)

grid.arrange(p2, p1)


Of course, you could always solve this by utilizing a different type of visualization:

ggplot(data = orig, aes(x = year, y = afw)) +
  geom_bar(stat = "identity", aes(fill=col), colour = "white")

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!