How to split a data frame?

臣服心动 2020-11-22 03:08

I want to split a data frame into several smaller ones. This looks like a trivial question, but I cannot find a solution through web search.

8 answers
  •  余生分开走
    2020-11-22 03:33

    Splitting the data frame seems counter-productive. Instead, use the split-apply-combine paradigm. For example, generate some data:

    df = data.frame(grp=sample(letters, 100, TRUE), x=rnorm(100))
    

    Then split only the relevant column, apply the scale() function to x within each group, and combine the results (using split<- or ave):

    df$z = 0
    split(df$z, df$grp) = lapply(split(df$x, df$grp), scale)
    ## alternative: df$z = ave(df$x, df$grp, FUN=scale)
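
    As a minimal sketch of why the two forms are interchangeable, the snippet below runs both on a small hand-made data frame. The toy values and the z_ave column name are assumptions chosen for illustration, so that the scaled result is easy to check by eye.

```r
# Toy data (assumed for illustration): three interleaved groups of three rows
df <- data.frame(grp = rep(c("a", "b", "c"), length.out = 9),
                 x = c(1, 10, 100, 2, 20, 200, 3, 30, 300))

# Form 1: split<- writes each scaled piece back into its original rows
df$z <- 0
split(df$z, df$grp) <- lapply(split(df$x, df$grp), scale)

# Form 2: ave() does the split, the apply, and the recombine in one call
df$z_ave <- ave(df$x, df$grp, FUN = scale)

# Both columns hold the same per-group z-scores, in the original row order
print(df)
print(all.equal(df$z, df$z_ave))
```

    Each group here is scaled to -1, 0, 1, and the values land back in the rows they came from, so no reordering or merging is needed afterwards.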
    

    This will be very fast compared to splitting data.frames, and the result remains usable in downstream analysis without iteration. I think the equivalent dplyr syntax is:

    library(dplyr)
    df %>% group_by(grp) %>% mutate(z=scale(x))
    

    In general this dplyr solution is faster than splitting data frames, but not as fast as the base R split<- / ave approach above.
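
    For comparison, if separate smaller data frames really are needed, base R's split() also works on a whole data frame. The following is only a sketch on assumed toy data (the pieces and restored names are illustrative), not a recommendation over the approach above.

```r
# Toy data (assumed for illustration)
df <- data.frame(grp = rep(c("a", "b", "c"), length.out = 9),
                 x = c(1, 10, 100, 2, 20, 200, 3, 30, 300))

# split() on a data frame returns a named list of smaller data frames,
# one per value of the grouping variable
pieces <- split(df, df$grp)
print(names(pieces))   # "a" "b" "c"
print(pieces$a)        # the rows of df where grp is "a"

# unsplit() reverses the split and puts the rows back in their original order
restored <- unsplit(pieces, df$grp)
print(restored)
```

    Each element of the list is an ordinary data frame, so it can be passed to lapply() or accessed by name, at the cost of the extra copying mentioned above.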
