R ggplot2 using ..count.. when using facet_grid

Submitted by 半世苍凉 on 2021-02-09 08:31:11

Question


I am using RStudio on Ubuntu, with up-to-date standard R and ggplot2.

I am trying to create a histogram in ggplot and to separate the data by groups.

I need the plot's y axis to show the relative frequency of each bin within the subgroup created by the facet grid.

For example, if I have two entries in the data:

a group
1 1
2 2

I need to use facet_grid to split by group, and then to show that a has one bar at 1 that accounts for 100% of the examples in group 1, and likewise one bar at 2 that accounts for 100% of the examples in group 2.

I found that the way to do this is (..count..)/sum(..count..), but sum(..count..) sums the counts over the entire data frame rather than within each facet, which gives me unwanted results.

I can't find good documentation on advanced use of the ..count.. variable. There is a question about special ggplot variables and another question about ..count.., but nothing very comprehensive in the docs.

This is the example code I am using:

library(ggplot2)

df <- data.frame(a = 1:10, b = 1:10, group = c(rep(1, 5), rep(2, 5)))
p <- ggplot(df) +
  geom_histogram(aes(x = a, y = (..count..)/sum(..count..))) +
  facet_grid(group ~ .)

You can see that the highest value on the y axis is 0.1. I would like it to show, for example, that 100% of the 1 values are in group 1, and so on.

Edit:

Thanks to Jimbou for the answer and the reference to a well-built workaround that is suitable for discrete data. Please note that the real problem I have here requires continuous data and bins that group more than one value. Furthermore, there is no proper documentation on how to do this with ..count.., and therefore I believe it is important to find a real solution rather than use a workaround.


Answer 1:


Here is a dplyr solution.

df %>% group_by(group) %>% mutate(n = n(), prop = n / sum(n))
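
To see what this pipeline produces, here is a small check on the question's own df (a sketch, assuming dplyr is installed; `res` is just a name picked for illustration). Since n() is the group size, sum(n) within a group is the group size squared, so prop ends up as each row's share of its group. That matches the per-value proportion only while every value of a occurs once per group, as it does here.

```r
library(dplyr)

# The data frame from the question
df <- data.frame(a = 1:10, b = 1:10, group = c(rep(1, 5), rep(2, 5)))

res <- df %>%
  group_by(group) %>%
  mutate(n = n(),            # size of the row's group (5 here)
         prop = n / sum(n))  # 5 / 25 = 0.2, i.e. one row's share of its group
res <- ungroup(res)

print(res)
```

With five rows per group, every row gets prop = 0.2.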



Answer 2:


After a lot of playing around, and with the very good directions you all gave, I found that a blend of Jimbou's and Shayaa's answers, plus a little added code, works beautifully.

t <- data %>% group_by(group, member, v_rate) %>% tally %>% mutate(f = n/sum(n))

This groups the data by group, member and v_rate and counts the rows in each combination. Because tally drops the last grouping level (v_rate), sum(n) is taken within each group/member combination, so f is the relative frequency within that combination.

Then we create the histogram with ggplot2 and use those values as the weight aesthetic of the histogram; otherwise it was all in vain.

p <- ggplot(t, aes(x = v_rate, weight = f)) + geom_histogram() + facet_grid(group ~ member)

That works great.
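
For anyone who wants to run this end to end, here is a minimal sketch of the same weighting idea. The data values are made up; only the column names (group, member, v_rate) come from the answer. The names `freq`, `bars` and `panel_totals`, and the binwidth of 1, are arbitrary choices for illustration. It uses count() followed by an explicit group_by(), which should give the same f as the tally version above.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for the answer's `data`
set.seed(42)
data <- data.frame(
  group  = rep(c("g1", "g2"), times = c(40, 160)),
  member = rep(c("m1", "m2"), times = 100),
  v_rate = round(runif(200, 0, 10), 1)
)

# Relative frequency of each v_rate within its group/member combination
freq <- data %>%
  count(group, member, v_rate) %>%
  group_by(group, member) %>%
  mutate(f = n / sum(n)) %>%
  ungroup()

# The weights sum to 1 in every panel, so each bar is a per-panel proportion
p <- ggplot(freq, aes(x = v_rate, weight = f)) +
  geom_histogram(binwidth = 1) +
  facet_grid(group ~ member)

# Inspect what was actually drawn: bar heights per panel should add up to 1
bars <- layer_data(p)
panel_totals <- tapply(bars$count, bars$PANEL, sum)
print(panel_totals)
```

Because the weights are attached to the data rather than to the bins, this keeps working with continuous values and bins that cover more than one value.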




Answer 3:


You can try:

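First calculate the size of each group using ave:

df$gr_l <- ave(df$a, df$group, FUN = length)

Get the proportion of each a within the groups using by:

df$gr_prop <- c(by(df, df$group + df$a, FUN = function(x) length(x$a)/unique(x$gr_l) ))

Note that indexing by df$group + df$a only works while every group/a pair gives a distinct sum and the rows are already sorted by that sum, as in the example df.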

Plot the data.

ggplot(df, aes(x=a, y=gr_prop)) + 
      geom_bar(stat="identity",position='dodge') + 
      facet_grid(group ~ .)

The question is similar to two earlier questions, which were answered using ddply or a ggplot-internal solution.
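
If the data contains repeated values, one possible variation is to let ave do both counts, since it accepts several grouping vectors. This is only a sketch with made-up data, not the answer's original code; `gr_n` and `plot_df` are names introduced here for illustration.

```r
library(ggplot2)

# Made-up data with repeated values of a inside each group
df <- data.frame(a = c(1, 1, 2, 3, 3, 3), group = c(1, 1, 1, 2, 2, 2))

# Rows per group, and rows per (group, a) combination, both aligned to the rows
df$gr_l <- ave(df$a, df$group, FUN = length)
df$gr_n <- ave(df$a, df$group, df$a, FUN = length)
df$gr_prop <- df$gr_n / df$gr_l

# One row per bar, so stat = "identity" does not stack duplicates
plot_df <- unique(df[, c("a", "group", "gr_prop")])

p <- ggplot(plot_df, aes(x = a, y = gr_prop)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_grid(group ~ .)
```

Like the original, this is still a discrete-data approach: it gives one bar per distinct value, not per bin.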




Answer 4:


Try ..density..? It is computed within each facet, so it gives the local mass, rather than the local count divided by the overall, all-encompassing count as the code is currently written.
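
A minimal sketch of how this could look with continuous data. Density is a count divided by the panel total and by the bin width, so multiplying it by the bin width should give the per-facet proportion. The data here is made up, and the sketch assumes a ggplot2 version that provides after_stat() (3.3.0 or later, as far as I know) in place of the ..name.. notation; `bars` and `panel_totals` are names chosen for illustration.

```r
library(ggplot2)

# Made-up continuous data with two groups of different size
set.seed(1)
df <- data.frame(
  a     = c(rnorm(50, mean = 0), rnorm(150, mean = 3)),
  group = rep(c(1, 2), times = c(50, 150))
)

# density * width = count / (total count in the panel)
p <- ggplot(df, aes(x = a)) +
  geom_histogram(aes(y = after_stat(density * width)), binwidth = 0.5) +
  facet_grid(group ~ .)

# Check what was drawn: bar heights in each panel should add up to 1
bars <- layer_data(p)
panel_totals <- tapply(bars$y, bars$PANEL, sum)
print(panel_totals)
```

This needs no pre-aggregation, and the bins can cover more than one value.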



Source: https://stackoverflow.com/questions/38562040/r-ggplot2-using-count-when-using-facet-grid
