One of the basic data types in R is factors. In my experience factors are basically a pain and I never use them. I always convert to characters. I feel oddly like I\'m missi
Factors are an excellent "unique-cases" badging engine. I've recreated this badly many times, and despite a couple of wrinkles occasionally, they are extremely powerful.
library(dplyr)
d <- tibble(x = sample(letters[1:10], 20, replace = TRUE))
## normalize this table into an indexed value across two tables
id <- tibble(x_u = sort(unique(d$x))) %>% mutate(x_i = row_number())
di <- tibble(x_i = as.integer(factor(d$x)))
## reconstruct d$x when needed
d2 <- inner_join(di, id) %>% transmute(x = x_u)
identical(d, d2)
## [1] TRUE
If there's a better way to do this task I'd love to see it, I don't see this capability of factor discussed.