How to reorder factor levels in a tidy way?

Submitted by 我的梦境 on 2019-11-29 19:35:04

Question


Hi, I usually use code like the following to reorder bars in ggplot or other types of plots.

Normal plot (unordered)

library(tidyverse)

# One row per species with its mean sepal width
iris.tr <- iris %>%
  group_by(Species) %>%
  mutate(mSW = mean(Sepal.Width)) %>%
  select(mSW, Species) %>%
  distinct()

ggplot(iris.tr, aes(x = Species, y = mSW, color = Species)) +
  geom_point(stat = "identity")

Ordering the factor + ordered plot

iris.tr$Species <- factor(iris.tr$Species,
                          levels = iris.tr[order(iris.tr$mSW),]$Species,
                          ordered = TRUE)
ggplot(iris.tr,aes(x = Species,y = mSW, color = Species)) + 
  geom_point(stat = "identity")

I find the factor line extremely unpleasant, and I wonder why arrange() or some other function can't simplify this. Am I missing something?

Note:

This does not work, but I would like to know whether something like it exists in the tidyverse.

iris.tr <- iris %>%
  group_by(Species) %>%
  mutate(mSW = mean(Sepal.Width)) %>%
  select(mSW, Species) %>%
  distinct() %>%
  arrange(mSW)

ggplot(iris.tr, aes(x = Species, y = mSW, color = Species)) +
  geom_point(stat = "identity")

Answer 1:


Using forcats:

iris.tr %>%
    mutate(Species = fct_reorder(Species, mSW)) %>%
    ggplot() +
    aes(Species, mSW, color = Species) +
    geom_point()
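
To confirm that the levels really changed, it can help to inspect them before and after the call. The following is only a minimal sketch: it builds the per-species means with summarise() rather than the mutate/select/distinct chain from the question (a substitution made here for brevity), and the names iris.sum and iris.ord are hypothetical.

```r
library(dplyr)
library(forcats)

# Hypothetical helper table: one row per species with its mean sepal width
iris.sum <- iris %>%
  group_by(Species) %>%
  summarise(mSW = mean(Sepal.Width))

# Levels before reordering: the default order from the iris data set
levels(iris.sum$Species)

# Reorder the levels of Species by ascending mSW
iris.ord <- iris.sum %>%
  mutate(Species = fct_reorder(Species, mSW))

# Levels after reordering: smallest mean first
levels(iris.ord$Species)
```

fct_reorder() also accepts .desc = TRUE if the largest mean should come first.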



Answer 2:


Reordering the factor using base:

iris.ba = iris
iris.ba$Species = with(iris.ba, reorder(Species, Sepal.Width, mean))

Translating to dplyr:

iris.tr = iris %>% mutate(Species = reorder(Species, Sepal.Width, mean))

After that, you can continue on to summarize and plot as in your question.
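
As a rough sketch of that continuation, assuming summarise() is an acceptable stand-in for the mutate/select/distinct steps in the question (iris.sum and p are hypothetical names introduced here):

```r
library(dplyr)
library(ggplot2)

# Reorder the Species levels by mean Sepal.Width on the raw data
iris.tr <- iris %>%
  mutate(Species = reorder(Species, Sepal.Width, mean))

# Summarise afterwards; the reordered levels carry over to the summary
iris.sum <- iris.tr %>%
  group_by(Species) %>%
  summarise(mSW = mean(Sepal.Width))

# The x axis and the legend follow the level order of Species
p <- ggplot(iris.sum, aes(x = Species, y = mSW, color = Species)) +
  geom_point()
p
```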


A couple of comments: reordering a factor modifies a data column, and the dplyr command for modifying a data column is mutate. All arrange does is reorder rows; this has no effect on the levels of the factor and hence no effect on the order of a legend or axis in ggplot.
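
The difference between row order and level order can be seen on a small made-up data frame. This is an illustrative sketch only; df, sorted, and releveled are hypothetical names and the values are invented for the example.

```r
library(dplyr)

# Toy data: three groups with values that are not in alphabetical order
df <- data.frame(
  grp = factor(c("a", "b", "c")),
  val = c(3, 1, 2)
)

# arrange() changes the row order only
sorted <- df %>% arrange(val)
as.character(sorted$grp)  # rows are now b, c, a
levels(sorted$grp)        # levels are still a, b, c

# mutate() with reorder() changes the levels themselves
releveled <- df %>% mutate(grp = reorder(grp, val))
levels(releveled$grp)     # levels are now b, c, a
```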

All factors have an order for their levels. The difference between an ordered = TRUE factor and a regular factor is how the contrasts are set up in a model. ordered = TRUE should only be used if your factor levels have a meaningful rank order, like "Low", "Medium", "High", and even then it only matters if you are building a model and don't want the default contrasts comparing everything to a reference level.
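
A short sketch of that difference, assuming R's default contrasts option is in effect (sizes, f_plain, and f_ordered are hypothetical names used only for this illustration):

```r
sizes <- c("Low", "Medium", "High")

# Both factors carry the same levels in the same order
f_plain   <- factor(sizes, levels = sizes)
f_ordered <- factor(sizes, levels = sizes, ordered = TRUE)
levels(f_plain)
levels(f_ordered)

# Regular factor: treatment contrasts against the first level
contrasts(f_plain)

# Ordered factor: polynomial contrasts (linear and quadratic terms)
contrasts(f_ordered)
```

For plotting, the level order is what matters, so a regular factor with reordered levels is enough.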



Source: https://stackoverflow.com/questions/45148897/how-to-reorder-factor-levels-in-a-tidy-way
