Factors in R: more than an annoyance?

梦如初夏 2020-11-28 19:30

One of the basic data types in R is the factor. In my experience, factors are basically a pain and I never use them; I always convert to characters. I feel oddly like I'm missi

7 answers
  •  北荒
    北荒 (OP)
    2020-11-28 20:06

    Factors are an excellent "unique-cases" badging engine: each distinct value gets an integer code. I've recreated this badly many times, and despite the occasional wrinkle, factors are extremely powerful.

    library(dplyr)
    d <- tibble(x = sample(letters[1:10], 20, replace = TRUE))
    
    ## normalize this table into an indexed value across two tables
    id <- tibble(x_u = sort(unique(d$x))) %>% mutate(x_i = row_number())
    di <- tibble(x_i = as.integer(factor(d$x)))
    
    
    ## reconstruct d$x when needed
    d2 <- inner_join(di, id, by = "x_i") %>% transmute(x = x_u)
    identical(d, d2)
    ## [1] TRUE
    

    If there's a better way to do this task, I'd love to see it. I don't see this capability of factor discussed.
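
    For comparison, here is a minimal base-R sketch of the same index/lookup split, without dplyr. It assumes a plain character vector rather than a tibble; the names codes, lookup, x_rebuilt and codes_match are made up for illustration, and the fixed seed is only there to make the run repeatable. It is one way to look at what factor() is doing, not the only one.

```r
# A factor stores a lookup table (its levels) and an integer index
# (its codes) together, so both halves can be pulled out directly.
set.seed(1)  # assumption: fixed seed only so the run is repeatable
x <- sample(letters[1:10], 20, replace = TRUE)

f <- factor(x)
codes  <- as.integer(f)  # integer index into the lookup table
lookup <- levels(f)      # sorted unique values, as character

# Rebuild the original values from index + lookup
x_rebuilt <- lookup[codes]
identical(x, x_rebuilt)

# The same codes computed without a factor, via match()
codes_match <- match(x, sort(unique(x)))
identical(codes, codes_match)
```

    The match() line is roughly what the factor is saving you from writing by hand: the levels play the role of the id table, and the integer codes play the role of di.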
