Sort a factor based on value in one or more other columns

攒了一身酷 2020-12-08 19:10

I've looked through a number of posts about ordering factors, but haven't quite found a match for my problem. Unfortunately, my knowledge of R is still pretty rudimentary.

3 answers
  • 2020-12-08 19:31

    Here's a reproducible sample, with solution:

    set.seed(0)
    a = sample(1:20,replace=FALSE)
    b = sample(1:20,replace=FALSE)
    f = as.factor(letters[1:20])
    
    > a
     [1] 18  6  7 10 15  4 13 14  8 20  1  2  9  5  3 16 12 19 11 17
    > b
     [1] 16 18  4 12  3  5  6  1 15 10 19 17  9 11  2  8 20  7 13 14
    > f
     [1] a b c d e f g h i j k l m n o p q r s t
    Levels: a b c d e f g h i j k l m n o p q r s t
    

    Now for the new factor:

    fn = factor(f, levels=unique(f[order(a,b,f)]), ordered=TRUE)
    
    > fn
     [1] a b c d e f g h i j k l m n o p q r s t
    20 Levels: k < l < o < f < n < b < c < i < m < d < s < q < g < h < e < ... < j
    

    The levels are sorted on 'a' first, then on 'b' to break ties in 'a', and finally on 'f' itself. In this example 'a' has no repeated values, so 'b' and 'f' never affect the result.
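
    Since the sample above has no ties, here is a minimal sketch of the same idea on data where the second key matters. The data frame `df` and its columns `grp`, `x` and `y` are hypothetical names made up for illustration, not taken from the question.

    ```r
    # Hypothetical data: 'x' contains ties, which 'y' then breaks
    df <- data.frame(
      grp = factor(c("a", "b", "c", "d")),
      x   = c(2, 1, 2, 1),
      y   = c(1, 4, 3, 2)
    )

    # Row order when sorting by x, then by y within equal x
    ord <- order(df$x, df$y)

    # Rebuild the factor so its levels follow that row order
    df$grp <- factor(df$grp, levels = unique(as.character(df$grp[ord])))

    levels(df$grp)
    #> [1] "d" "b" "a" "c"
    ```

    Both 'b' and 'd' have x = 1, so 'y' decides that 'd' comes first. The same happens for 'a' and 'c' at x = 2.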

  • 2020-12-08 19:46

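    The function fct_reorder2 from the forcats package reorders a factor using two other variables. It is not a plain two-key sort, though: by default it summarises each level with last2(), that is, the value of the second variable at the largest value of the first.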

    Note the subtlety that fct_reorder sorts in ascending order by default, while fct_reorder2 sorts in descending order.

    Code from the documentation:

    library(forcats)

    df0 <- tibble::tribble(
      ~color,     ~a, ~b,
      "blue",      1,  2,
      "green",     6,  2,
      "purple",    3,  3,
      "red",       2,  3,
      "yellow",    5,  1
    )

    df0$color <- factor(df0$color)
    fct_reorder(df0$color, df0$a, min)
    #> [1] blue   green  purple red    yellow
    #> Levels: blue red purple yellow green
    fct_reorder2(df0$color, df0$a, df0$b)
    
  • 2020-12-08 19:49

    I recommend the following dplyr-based approach (h/t daattali) that can be extended to as many columns as you like:

    library(dplyr)
    Catalog <- Catalog %>%
      arrange(MIDDATE, TYPENAME) %>%               # sort your dataframe
      mutate(IDENTIFY = factor(IDENTIFY, unique(IDENTIFY))) # reset your factor-column based on that order
    