R geom_tile ggplot2 what kind of stat is applied?

Submitted by 风格不统一 on 2019-12-18 06:54:47

Question


I used geom_tile() to plot 3 variables on the same graph, with:

tile_ruined_coop<-ggplot(data=df.1[sel1,])+
  geom_tile(aes(x=bonus, y=malus, fill=rf/300))+
  scale_fill_gradient(name="vr")+
  facet_grid(Seuil_out_coop_i ~ nb_coop_init)
tile_ruined_coop

and I am pleased with the result!

But what kind of statistical treatment is applied to fill? Is it a mean?


Answer 1:


To plot the mean of the fill values, you should aggregate your values before plotting. scale_fill_gradient(...) does not work at the data level but at the visualization level: it only maps values to colors. Let's start with a toy data frame to build a reproducible example.

mydata = expand.grid(bonus = seq(0, 1, 0.25), malus = seq(0, 1, 0.25), type = c("Risquophile","Moyen","Risquophobe"))
mydata = do.call("rbind",replicate(40, mydata, simplify = FALSE))
mydata$value= runif(nrow(mydata), min=0, max=50)
mydata$coop = "cooperative"

Now, before plotting, I suggest calculating the mean over each group of 40 values. For this operation I like to use the dplyr package:

library(dplyr)
# Pass bare column names to group_by(); quoted strings are not interpreted as columns
data = mydata %>% group_by(bonus, malus, type, coop) %>% summarise(vr=mean(value))

Now you have your dataset ready to plot with ggplot2:

library(ggplot2)
g = ggplot(data, aes(x=bonus,y=malus,fill=vr))
g = g + geom_tile()
g = g + facet_grid(type~coop)
g  # display the plot

In the resulting plot, the fill value of each tile is exactly the mean of your values.
Is this what you expected?
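
Before aggregating, it can help to confirm that the data really holds several rows per tile. The following is a minimal sketch, assuming the toy mydata from above (rebuilt here so the block runs on its own) and dplyr's count(). The name rows_per_tile is only illustrative.

```r
library(dplyr)

# Rebuild the toy data: 5 x 5 x 3 = 75 combinations, each replicated 40 times
mydata <- expand.grid(bonus = seq(0, 1, 0.25),
                      malus = seq(0, 1, 0.25),
                      type  = c("Risquophile", "Moyen", "Risquophobe"))
mydata <- do.call("rbind", replicate(40, mydata, simplify = FALSE))
mydata$value <- runif(nrow(mydata), min = 0, max = 50)
mydata$coop <- "cooperative"

# Count how many rows share each tile position within each facet
rows_per_tile <- mydata %>% count(bonus, malus, type, coop)

# TRUE means geom_tile() would draw several tiles on top of each other
any(rows_per_tile$n > 1)
```

If this returns TRUE, the fill you see for a tile comes from a single row rather than from a summary of the group, so aggregating first is the safer route.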




Answer 2:


It uses stat_identity, as can be seen in the documentation. You can test that easily:

library(ggplot2)

# Toy data with two rows at the same position (x = 1, y = 1)
DF <- data.frame(x=c(rep(1:2, 2), 1), 
                 y=c(rep(1:2, each=2), 1), 
                 fill=1:5)

#  x y fill
#1 1 1    1
#2 2 1    2
#3 1 2    3
#4 2 2    4
#5 1 1    5

p <- ggplot(data=DF) +
  geom_tile(aes(x=x, y=y, fill=fill))

print(p)

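As you can see, the fill value for the 1/1 combination is 5: that row is not averaged with the other 1/1 row (fill = 1); its tile is simply drawn last, on top of the earlier one. If you use factors, it's even clearer what happens: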

p <- ggplot(data=DF) +
  geom_tile(aes(x=x, y=y, fill=factor(fill)))

print(p)
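
One way to check this without relying on the picture is to inspect the data that ggplot2 actually draws. This is a minimal sketch, assuming a ggplot2 version that provides layer_data(); DF is rebuilt so the block is self-contained, and the name drawn is only illustrative.

```r
library(ggplot2)

DF <- data.frame(x = c(rep(1:2, 2), 1),
                 y = c(rep(1:2, each = 2), 1),
                 fill = 1:5)

p <- ggplot(data = DF) +
  geom_tile(aes(x = x, y = y, fill = fill))

# layer_data() returns the data ggplot2 computed for the first layer
drawn <- layer_data(p, 1)

# stat_identity leaves the rows untouched: one tile per input row
nrow(drawn)

# Both rows at x = 1, y = 1 are still present; nothing was averaged
sum(drawn$x == 1 & drawn$y == 1)
```

Five rows come back, two of them at x = 1, y = 1, which is consistent with the tiles being overplotted rather than summarised.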

If you want to depict means, I'd suggest calculating them outside of ggplot2:

library(plyr)
DF1 <- ddply(DF, .(x, y), summarize, fill=mean(fill))
p <- ggplot(data=DF1) +
  geom_tile(aes(x=x, y=y, fill=fill))

print(p)

That's easier than trying to find out if stat_summary can play with geom_tile somehow (I doubt it).
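
If you would rather not load plyr, the same pre-aggregation can be sketched in base R with aggregate(). This is an alternative to the ddply() call, not what the answer itself used; DF is rebuilt so the block runs on its own.

```r
library(ggplot2)

DF <- data.frame(x = c(rep(1:2, 2), 1),
                 y = c(rep(1:2, each = 2), 1),
                 fill = 1:5)

# Mean of fill for every x/y combination (one row per tile afterwards)
DF1 <- aggregate(fill ~ x + y, data = DF, FUN = mean)

p <- ggplot(data = DF1) +
  geom_tile(aes(x = x, y = y, fill = fill))

print(p)
```

The two rows at x = 1, y = 1 (fill 1 and 5) collapse into a single row with fill 3, so each tile now shows a mean.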




Answer 3:


scale_fill_gradient() and geom_tile() apply no statistics (or rather, they apply stat_identity()) to your fill value rf/300. The scale just computes how many colors you need and then generates them with the munsell function 'mnsl()'. If you only want to transform how the values are mapped to the displayed colors, you can use:

scale_fill_gradient(trans = "log")

or

scale_fill_gradient(trans = "sqrt")

Changing the colors among the tiles may not be the best idea, since the plots have to be comparable and you compare the values by their colours. Hope this helps.
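
To see what such a scale transformation does, here is a minimal sketch on made-up data (the data frame and the names p_linear and p_sqrt are assumptions for illustration). It uses the fill scale, since the tiles are colored through the fill aesthetic. Recent ggplot2 releases may prefer the argument name transform over trans, so check this against your installed version.

```r
library(ggplot2)

# Made-up values spanning a wide range, one row per tile
DF <- data.frame(x = rep(1:3, 3),
                 y = rep(1:3, each = 3),
                 fill = c(1, 2, 4, 8, 16, 32, 64, 128, 256))

# Linear mapping: the largest values dominate the color range
p_linear <- ggplot(DF) +
  geom_tile(aes(x = x, y = y, fill = fill)) +
  scale_fill_gradient()

# Square-root mapping: same data and tiles, only the value-to-color mapping changes
p_sqrt <- ggplot(DF) +
  geom_tile(aes(x = x, y = y, fill = fill)) +
  scale_fill_gradient(trans = "sqrt")

print(p_linear)
print(p_sqrt)
```

Neither plot summarises anything: both draw one tile per row, and only the way values are spread over the gradient differs.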



Source: https://stackoverflow.com/questions/25341581/r-geom-tile-ggplot2-what-kind-of-stat-is-applied
