I have a large dataset that chokes split() in R. I am able to use dplyr group_by (which is a preferred way anyway) but I am unable to persist the r
Since dplyr 0.5.0.9000, the shortest solution that uses group_by() is probably to follow do() with pull():
df %>% group_by(V1) %>% do(data=(.)) %>% pull(data)
Note that, unlike split(), this doesn't name the resulting list elements. If names are desired, then you would probably want something like the following (set_names() comes from purrr, so that package needs to be loaded):
df %>% group_by(V1) %>% do(data = (.)) %>% with( set_names(data, V1) )
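For anyone who wants to try the named variant end to end, here is a minimal sketch. The toy data frame and the V2 column are made up for illustration, and it assumes purrr is installed to supply set_names() (base R's setNames() should work the same way here).

```r
library(dplyr)  # %>%, group_by(), do()
library(purrr)  # set_names()

# Hypothetical toy data; V1 plays the role of the grouping column
df <- data.frame(V1 = c("a", "b", "a", "c"), V2 = 1:4,
                 stringsAsFactors = FALSE)

# One list element per group, named after the group's value of V1
named <- df %>%
  group_by(V1) %>%
  do(data = (.)) %>%
  with(set_names(data, V1))

names(named)  # the distinct values of V1
named$a       # the rows of df where V1 == "a"
```

The with() call evaluates set_names(data, V1) inside the result of do(), so data and V1 refer to its two columns.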
To editorialize a little, I agree with the folks saying that split() is the better option. Personally, I always found it annoying that I have to type the name of the data frame twice (e.g., split( potentiallylongname, potentiallylongname$V1 )), but the issue is easily sidestepped with the pipe:
df %>% split( .$V1 )
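As a quick check of the piped split() call, here is a small self-contained sketch on made-up data (the data frame and its V2 column are hypothetical). dplyr is loaded only to get the %>% pipe; magrittr alone would do as well.

```r
library(dplyr)  # only needed here for the %>% pipe

# Hypothetical toy data; V1 is the column to split on
df <- data.frame(V1 = c("a", "b", "a", "c"), V2 = 1:4,
                 stringsAsFactors = FALSE)

# The data frame is named once; .$V1 refers back to it inside the call
pieces <- df %>% split(.$V1)

names(pieces)         # list elements are named after the values of V1
sapply(pieces, nrow)  # number of rows in each piece
```

Because the dot appears only inside the nested .$V1 expression, the pipe still passes df as the first argument, so the call is equivalent to split(df, df$V1).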