How to split a DataFrame in pandas in predefined percentages?

不知归路 2020-12-19 00:38

I have a pandas DataFrame sorted by a number of columns. Now I'd like to split it into predefined percentages, so that I can extract and name a few segments.

F

2 Answers
  •  醉话见心
    2020-12-19 01:06

    Use numpy.split:

    a, b, c = np.split(df, [int(.2*len(df)), int(.5*len(df))])
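The list passed to `np.split` holds cumulative cut points (row positions), not segment sizes. A minimal sketch of that arithmetic, assuming a 20-row input and using a plain NumPy array in place of the DataFrame (the names `n_rows` and `cut_points` are only for illustration):

```python
import numpy as np

n_rows = 20

# Cut after 20% and after 50% of the rows.
cut_points = [int(0.2 * n_rows), int(0.5 * n_rows)]

# np.split cuts *at* these positions: [0:4], [4:10], [10:20].
parts = np.split(np.arange(n_rows), cut_points)
sizes = [len(p) for p in parts]

print(cut_points)  # [4, 10]
print(sizes)       # [4, 6, 10]
```

So the three segments hold 20%, 30% and 50% of the rows respectively.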
    

    Sample:

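    import numpy as np
    import pandas as pd

    # Reproducible 20x5 frame of random values
    np.random.seed(100)
    df = pd.DataFrame(np.random.random((20, 5)), columns=list('ABCDE'))

    # Cut after 20% and 50% of the rows: segments of 4, 6 and 10 rows
    a, b, c = np.split(df, [int(.2*len(df)), int(.5*len(df))])
    print(a)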
              A         B         C         D         E
    0  0.543405  0.278369  0.424518  0.844776  0.004719
    1  0.121569  0.670749  0.825853  0.136707  0.575093
    2  0.891322  0.209202  0.185328  0.108377  0.219697
    3  0.978624  0.811683  0.171941  0.816225  0.274074
    
    print(b)
              A         B         C         D         E
    4  0.431704  0.940030  0.817649  0.336112  0.175410
    5  0.372832  0.005689  0.252426  0.795663  0.015255
    6  0.598843  0.603805  0.105148  0.381943  0.036476
    7  0.890412  0.980921  0.059942  0.890546  0.576901
    8  0.742480  0.630184  0.581842  0.020439  0.210027
    9  0.544685  0.769115  0.250695  0.285896  0.852395
    
    print(c)
               A         B         C         D         E
    10  0.975006  0.884853  0.359508  0.598859  0.354796
    11  0.340190  0.178081  0.237694  0.044862  0.505431
    12  0.376252  0.592805  0.629942  0.142600  0.933841
    13  0.946380  0.602297  0.387766  0.363188  0.204345
    14  0.276765  0.246536  0.173608  0.966610  0.957013
    15  0.597974  0.731301  0.340385  0.092056  0.463498
    16  0.508699  0.088460  0.528035  0.992158  0.395036
    17  0.335596  0.805451  0.754349  0.313066  0.634037
    18  0.540405  0.296794  0.110788  0.312640  0.456979
    19  0.658940  0.254258  0.641101  0.200124  0.657625
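
Since the question also asks to name the segments, the same split can be sketched with plain `DataFrame.iloc` slicing, which may also be preferable if `np.split` on a DataFrame raises a deprecation warning in your pandas version. The helper `split_by_fractions` and the segment names below are hypothetical, not part of pandas or NumPy. Note that it rounds the boundaries instead of truncating with `int()`, so results can differ by one row from the snippet above when the percentages do not divide the row count evenly.

```python
import numpy as np
import pandas as pd


def split_by_fractions(df, fractions):
    """Split df row-wise into named segments.

    fractions maps a segment name to its share of the rows,
    e.g. {"a": 0.2, "b": 0.3, "c": 0.5}; the shares must sum to 1.
    """
    names = list(fractions)
    shares = np.array([fractions[name] for name in names], dtype=float)
    if not np.isclose(shares.sum(), 1.0):
        raise ValueError("fractions must sum to 1")

    # Cumulative end position of each segment, rounded to whole rows.
    ends = np.round(np.cumsum(shares) * len(df)).astype(int)
    ends[-1] = len(df)  # make sure the last segment reaches the final row
    starts = np.concatenate(([0], ends[:-1]))

    # Row order is preserved, so a sorted frame stays sorted within segments.
    return {name: df.iloc[s:e] for name, s, e in zip(names, starts, ends)}


np.random.seed(100)
df = pd.DataFrame(np.random.random((20, 5)), columns=list("ABCDE"))

segments = split_by_fractions(df, {"a": 0.2, "b": 0.3, "c": 0.5})
for name, part in segments.items():
    print(name, len(part))
```

Returning a dict keeps each segment addressable by name (`segments["a"]`) without unpacking a fixed number of variables.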
    
