Order Bars in ggplot2 bar graph

Anonymous (unverified), submitted 2019-12-03 01:58:03

Question:

I am trying to make a bar graph where the largest bar is nearest to the y axis and the shortest bar is furthest away. My table looks like this:

        Name   Position
    1   James  Goalkeeper
    2   Frank  Goalkeeper
    3   Jean   Defense
    4   Steve  Defense
    5   John   Defense
    6   Tim    Striker

I am trying to build a bar graph that shows the number of players in each position:

p 

but the graph shows the goalkeeper bar first, then the defense bar, and finally the striker bar. I want the bars ordered so that the defense bar is closest to the y axis, followed by the goalkeeper bar, and finally the striker bar. Thanks.
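A minimal sketch that reproduces the situation described, for anyone who wants to try the answers. The data frame name `theTable` comes from the answers; the plotting call is an assumption, since the original one is not shown in full. The sketch also assumes the factor levels follow order of appearance, because that is what yields the goalkeeper, defense, striker order the question complains about. The plotting step runs only if ggplot2 is installed.

```r
# Rebuild the example table from the question.
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  Position = c("Goalkeeper", "Goalkeeper", "Defense",
               "Defense", "Defense", "Striker"),
  stringsAsFactors = FALSE
)

# Assumption: levels in order of first appearance (Goalkeeper, Defense, Striker).
theTable$Position <- factor(theTable$Position,
                            levels = unique(theTable$Position))

# Bars are drawn in level order, not in order of height.
levels(theTable$Position)
table(theTable$Position)

# Assumed plotting call: one bar per Position, height = number of rows.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(theTable, aes(x = Position)) + geom_bar()
  print(p)
}
```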

Answer 1:

The key with ordering is to set the levels of the factor in the order you want. An ordered factor is not required; the extra information in an ordered factor isn't necessary, and if these data are used in a statistical model the wrong parametrisation might result: polynomial contrasts aren't right for nominal data such as this.

## set the levels in order we want
theTable 
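One way this step might look, as a sketch rather than the answer's exact code: count the positions with `table()`, sort the counts in decreasing order, and use the resulting names as the factor levels. The data frame below is rebuilt from the question so the snippet runs on its own.

```r
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  Position = c("Goalkeeper", "Goalkeeper", "Defense",
               "Defense", "Defense", "Striker"),
  stringsAsFactors = FALSE
)

# Counts per position, largest first.
counts <- sort(table(theTable$Position), decreasing = TRUE)

# A plain (unordered) factor whose levels follow the sorted counts.
theTable$Position <- factor(theTable$Position, levels = names(counts))

levels(theTable$Position)
is.ordered(theTable$Position)  # FALSE: only the level order changed
```

With the levels set this way, a plain `ggplot(theTable, aes(x = Position)) + geom_bar()` should draw the bars from most to least frequent.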

In the most general sense, we simply need to set the factor levels to be in the desired order. There are multiple ways of doing this depending on the situation. For instance, we could do:

levels(theTable$Position) 

and simply list the levels in the desired order on the right hand side. You can also specify the level order within the call to factor as above:

theTable$Position 
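A small sketch of specifying the order inside `factor()`, using a standalone vector `pos` (a hypothetical name) instead of the data frame. It also shows a caveat worth checking on your own data: assigning a new vector to `levels()` renames the existing levels position by position, so it changes the data unless the new vector is in the same order as the old levels.

```r
pos <- factor(c("Goalkeeper", "Goalkeeper", "Defense",
                "Defense", "Defense", "Striker"))
levels(pos)  # default is alphabetical: Defense, Goalkeeper, Striker

# Reordering: pass the desired order to factor(); the values are unchanged.
reordered <- factor(pos, levels = c("Striker", "Goalkeeper", "Defense"))
table(reordered)

# Relabelling: levels<- maps old level i to new label i, so the
# three "Defense" rows are now labelled "Striker".
relabelled <- pos
levels(relabelled) <- c("Striker", "Goalkeeper", "Defense")
table(relabelled)
```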


Answer 2:

@GavinSimpson: reorder is a powerful and effective solution for this:

ggplot(theTable,
       aes(x=reorder(Position,Position,
                     function(x)-length(x)))) +
       geom_bar()


Answer 3:

Use scale_x_discrete(limits = ...) to specify the order of the bars.

positions 
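A sketch of how this approach could be written out. The contents of `positions` here are an assumption: the vector is derived from the counts so the bars run from most to least frequent, but any hand-written vector of level names would work the same way. Only the scale is changed; the factor in the data is left as it is. The plotting step runs only if ggplot2 is installed.

```r
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  Position = c("Goalkeeper", "Goalkeeper", "Defense",
               "Defense", "Defense", "Striker"),
  stringsAsFactors = FALSE
)

# Desired left-to-right order of the bars (assumed: by decreasing count).
positions <- names(sort(table(theTable$Position), decreasing = TRUE))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(theTable, aes(x = Position)) +
    geom_bar() +
    scale_x_discrete(limits = positions)  # order of the x axis categories
  print(p)
}
```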


Answer 4:

I think the already provided solutions are overly verbose. A more concise way to do a frequency sorted barplot with ggplot is

ggplot(theTable, aes(x=reorder(Position, -table(Position)[Position]))) + geom_bar() 

It's similar to what Alex Brown suggested, but a bit shorter, and it works without an anonymous function definition.

Update

I think my old solution was good at the time, but nowadays I'd rather use forcats::fct_infreq, which sorts factor levels by frequency:

require(forcats)
ggplot(theTable, aes(fct_infreq(Position))) + geom_bar()


Answer 5:

You just need to specify the Position column to be an ordered factor where the levels are ordered by their counts:

theTable 

(Note that table(Position) produces a frequency count of the Position column.)

Then your ggplot function will show the bars in decreasing order of count. I don't know if there's an option in geom_bar to do this without having to explicitly create an ordered factor.
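A sketch of what such a call might look like, assuming the same `theTable` data frame as in the question. Negating the counts before sorting puts the most frequent position first, and `ordered()` builds the ordered factor from those level names.

```r
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  Position = c("Goalkeeper", "Goalkeeper", "Defense",
               "Defense", "Defense", "Striker"),
  stringsAsFactors = FALSE
)

# Frequency count of the Position column.
counts <- table(theTable$Position)
counts

# Sorting the negated counts in increasing order gives decreasing frequency.
theTable$Position <- ordered(theTable$Position,
                             levels = names(sort(-counts)))

levels(theTable$Position)
```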



Answer 6:

A simple dplyr based reordering of factors can solve this problem:

library(dplyr)
library(ggplot2)

# reorder the table and reset the factor to that ordering
theTable %>%
  group_by(Position) %>%                              # calculate the counts
  summarize(counts = n()) %>%
  arrange(-counts) %>%                                # sort by counts
  mutate(Position = factor(Position, Position)) %>%   # reset factor
  ggplot(aes(x=Position, y=counts)) +                 # plot
    geom_bar(stat="identity")                         # bar heights are the pre-computed counts


Answer 7:

Like reorder() in Alex Brown's answer, we could also use forcats::fct_reorder(). It sorts the levels of the factor given as the first argument according to the values in the second argument, after applying a summary function (the default is median, which is fine here because there is only one value per factor level).

Unfortunately, the order required in the OP's question is also alphabetical, which is the default level order when you create a factor, so it hides what this function is actually doing. To make it clearer, I'll replace "Goalkeeper" with "Zoalkeeper".

library(tidyverse)
library(forcats)

theTable %>%
    count(Position) %>%
    mutate(Position = fct_reorder(Position, n, .desc = TRUE)) %>%
    ggplot(aes(x = Position, y = n)) + geom_bar(stat = 'identity')
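To see the effect on the levels without plotting, here is a small sketch. The data frame is an assumption rebuilt from the question's table with "Goalkeeper" renamed to "Zoalkeeper", and the counting uses base R as a stand-in for `count()` so that only forcats is needed. The reordering step runs only if forcats is installed.

```r
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  Position = c("Zoalkeeper", "Zoalkeeper", "Defense",
               "Defense", "Defense", "Striker")
)

# Base R stand-in for count(Position): one row per position, count in column n.
counts <- as.data.frame(table(Position = theTable$Position),
                        responseName = "n")
levels(counts$Position)  # alphabetical: Defense, Striker, Zoalkeeper

if (requireNamespace("forcats", quietly = TRUE)) {
  # Reorder the levels by n, largest first.
  counts$Position <- forcats::fct_reorder(counts$Position, counts$n,
                                          .desc = TRUE)
  print(levels(counts$Position))
}
```

Because "Zoalkeeper" sorts last alphabetically but has the second-largest count, its move to the middle shows that the order now comes from `n` and not from the alphabet.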



Answer 8:

I agree with zach that counting within dplyr is the best solution. I've found this to be the shortest version:

dplyr::count(theTable, Position) %>%
  arrange(-n) %>%
  mutate(Position = factor(Position, Position)) %>%
  ggplot(aes(x=Position, y=n)) + geom_bar(stat="identity")

This may also be faster than reordering the factor levels beforehand, since the counting is done in dplyr rather than in ggplot or with table.



Answer 9:

In addition to forcats::fct_infreq, mentioned by @HolgerBrandl, there is forcats::fct_rev, which reverses the factor order.

theTable 
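A sketch of how the two functions combine, using a standalone factor `pos` (a hypothetical name) built from the question's data. `fct_infreq()` puts the most frequent level first and `fct_rev()` flips that, so the least frequent level comes first. The forcats step runs only if the package is installed.

```r
pos <- factor(c("Goalkeeper", "Goalkeeper", "Defense",
                "Defense", "Defense", "Striker"))

if (requireNamespace("forcats", quietly = TRUE)) {
  most_first  <- forcats::fct_infreq(pos)       # Defense, Goalkeeper, Striker
  least_first <- forcats::fct_rev(most_first)   # Striker, Goalkeeper, Defense
  print(levels(most_first))
  print(levels(least_first))
}
```

In a plot, the same idea would be `aes(x = fct_rev(fct_infreq(Position)))` with forcats attached, which should draw the bars from shortest to tallest.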


