df %>% split(.$x)
becomes slow for a large number of unique values of x. If we instead split the data frame manually into smaller subsets and then
A very nice trick is to use group_split from dplyr 0.8.3 or above:
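library(dplyr)  # provides tibble(), %>% and group_split()
random_df <- tibble(colA = paste("A", 1:1200000, sep = "_"),
                    colB = paste("A", 1:1200000, sep = "_"),
                    colC = 1:1200000)
# Slow: base split() with 1.2 million unique values in colC
random_df_list <- split(random_df, random_df$colC)
# Fast: group_split() on the same column
random_df_list <- random_df %>% group_split(colC)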
This reduces an operation that takes a few minutes to a few seconds!
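If you want to check the difference on your own machine, here is a rough sketch of how the comparison above could be timed. It uses a much smaller, hypothetical data frame (`small_df`, 10,000 rows) so that it finishes quickly; the actual timings will depend on your hardware and package versions. It also shows one difference to keep in mind: split() names the list elements by group value, while group_split() returns an unnamed list, so the names are restored here from group_keys().

```r
library(dplyr)

# Hypothetical smaller example (10,000 rows rather than 1.2 million) so it runs quickly.
n <- 10000L
small_df <- tibble(colA = paste("A", seq_len(n), sep = "_"),
                   colC = seq_len(n))

# Time both approaches; absolute numbers vary by machine and package versions.
t_base  <- system.time(base_list  <- split(small_df, small_df$colC))
t_dplyr <- system.time(dplyr_list <- small_df %>% group_split(colC))
print(c(base = t_base[["elapsed"]], dplyr = t_dplyr[["elapsed"]]))

# split() names each piece by its group value; group_split() does not.
# group_keys() lists the groups in the same order as group_split(),
# so it can be used to restore the names.
keys <- small_df %>% group_by(colC) %>% group_keys()
dplyr_list <- setNames(dplyr_list, as.character(keys$colC))
```

After this, `dplyr_list[["42"]]` and `base_list[["42"]]` should both give the one-row tibble for `colC == 42`.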