Emulate split() with dplyr group_by: return a list of data frames

忘掉有多难 2020-11-29 06:07

I have a large dataset that chokes split() in R. I am able to use dplyr group_by (which is a preferred way anyway) but I am unable to persist the r

6 answers
  •  轻奢々 (OP)
    2020-11-29 06:30

    Since dplyr 0.5.0.9000, the shortest solution that uses group_by() is probably to follow do() with pull():

    df %>% group_by(V1) %>% do(data=(.)) %>% pull(data)
    

    Note that, unlike split(), this doesn't name the resulting list elements. If names are desired, you would probably want something like the following, where set_names() comes from a package such as purrr:

    df %>% group_by(V1) %>% do(data = (.)) %>% with( set_names(data, V1) )
    

    To editorialize a little, I agree with the folks saying that split() is the better option. Personally, I always found it annoying that I have to type the name of the data frame twice (e.g., split( potentiallylongname, potentiallylongname$V1 )), but the issue is easily sidestepped with the pipe:

    df %>% split( .$V1 )
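
    For anyone who wants to try the three variants side by side, here is a minimal sketch on a toy data frame. The data, the V2 column, and the variable names unnamed, named, and base_split are made up for illustration; set_names() is assumed to come from purrr, and do() is assumed to still be available in the installed dplyr.

    ```r
    library(dplyr)
    library(purrr)  # assumption: set_names() is taken from purrr

    # Toy data (hypothetical); the grouping column V1 matches the snippets above
    df <- data.frame(V1 = c("a", "b", "a", "c"),
                     V2 = 1:4,
                     stringsAsFactors = FALSE)

    # group_by() + do() + pull(): a list of per-group data frames
    unnamed <- df %>% group_by(V1) %>% do(data = (.)) %>% pull(data)

    # Same list, with each element named after its group key
    named <- df %>% group_by(V1) %>% do(data = (.)) %>% with(set_names(data, V1))

    # Base R split() through the pipe, so the data frame name is typed once
    base_split <- df %>% split(.$V1)
    ```

    Calling names(named) and names(base_split) afterwards is a quick way to check that both lists are keyed by the values of V1.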
    
