Splitting dataframe into multiple dataframes

南方客 2020-11-22 01:16

I have a very large dataframe (around 1 million rows) with data from an experiment (60 respondents).

I would like to split the dataframe into 60 dataframes (a datafra

11 answers
  •  耶瑟儿~
    2020-11-22 01:37

    You can use groupby if you already have labels for your data.

     out_list = [group[1] for group in in_series.groupby(label_series.values)]
    

    Here's a detailed example:

    Let's say we want to partition a pd.Series into a list of chunks using some labels. For example, in_series is:

    2019-07-01 08:00:00   -0.10
    2019-07-01 08:02:00    1.16
    2019-07-01 08:04:00    0.69
    2019-07-01 08:06:00   -0.81
    2019-07-01 08:08:00   -0.64
    Length: 5, dtype: float64
    

    And its corresponding label_series is:

    2019-07-01 08:00:00   1
    2019-07-01 08:02:00   1
    2019-07-01 08:04:00   2
    2019-07-01 08:06:00   2
    2019-07-01 08:08:00   2
    Length: 5, dtype: int64
    

    Run:

    out_list = [group[1] for group in in_series.groupby(label_series.values)]
    

    which returns out_list, a list of two pd.Series:

    [2019-07-01 08:00:00   -0.10
    2019-07-01 08:02:00    1.16
    Length: 2, dtype: float64,
    2019-07-01 08:04:00    0.69
    2019-07-01 08:06:00   -0.81
    2019-07-01 08:08:00   -0.64
    Length: 3, dtype: float64]
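
    The example above can be reproduced end to end with the sketch below. The values are copied from the listing; building the index with pd.date_range at a 2-minute frequency is my assumption about how the timestamps were generated.

```python
import pandas as pd

# Rebuild the example inputs shown above (index construction is an assumption).
index = pd.date_range("2019-07-01 08:00:00", periods=5, freq="2min")
in_series = pd.Series([-0.10, 1.16, 0.69, -0.81, -0.64], index=index)
label_series = pd.Series([1, 1, 2, 2, 2], index=index)

# groupby yields (label, chunk) pairs; keep only the chunk.
out_list = [chunk for _, chunk in in_series.groupby(label_series.values)]

for chunk in out_list:
    print(chunk, end="\n\n")
```

    Unpacking each pair as `_, chunk` does the same thing as `group[1]` in the one-liner; it just names the part being kept.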
    

    Note that you can also group by an attribute of in_series itself, e.g., in_series.index.day.
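
    The same idea should carry over to the DataFrame in the question. This is a minimal sketch, assuming the respondent is identified by a column; the names respondent_id and score are hypothetical, since the question does not show the actual columns. The second part illustrates grouping on an index attribute such as index.day.

```python
import pandas as pd

# Hypothetical stand-in for the experiment data; column names are assumed.
df = pd.DataFrame({
    "respondent_id": ["a", "a", "b", "c", "c", "c"],
    "score": [1, 2, 3, 4, 5, 6],
})

# One DataFrame per respondent, keyed by the respondent's label.
frames = {key: group for key, group in df.groupby("respondent_id")}

# Grouping on an attribute of the index instead of a separate label series.
ts = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(
        ["2019-07-01 08:00", "2019-07-01 09:00", "2019-07-02 08:00"]
    ),
)
by_day = [chunk for _, chunk in ts.groupby(ts.index.day)]
```

    A dict keeps each piece addressable by its label (frames["a"]), which is usually easier to work with than 60 separately named variables.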
