Pandas: Appending a row to a DataFrame and specifying its index label

忘掉有多难 2020-12-12 23:32

Is there any way to specify the index that I want for a new row, when appending the row to a dataframe?

The original documentation provides the following example:

4 answers
  • 2020-12-13 00:03

    The name of the Series becomes the index of the row in the DataFrame:

    In [99]: df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
    
    In [100]: s = df.xs(3)
    
    In [101]: s.name = 10
    
    In [102]: df.append(s)
    Out[102]: 
               A         B         C         D
    0  -2.083321 -0.153749  0.174436  1.081056
    1  -1.026692  1.495850 -0.025245 -0.171046
    2   0.072272  1.218376  1.433281  0.747815
    3  -0.940552  0.853073 -0.134842 -0.277135
    4   0.478302 -0.599752 -0.080577  0.468618
    5   2.609004 -1.679299 -1.593016  1.172298
    6  -0.201605  0.406925  1.983177  0.012030
    7   1.158530 -2.240124  0.851323 -0.240378
    10 -0.940552  0.853073 -0.134842 -0.277135
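
    On recent pandas releases DataFrame.append is no longer available (deprecated in 1.4, removed in 2.0). A minimal sketch of the same idea with pd.concat, assuming the named Series is first turned into a one-row frame with to_frame().T; the variable name result is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])

# Take row 3 as a Series and give it the index label we want.
s = df.loc[3].copy()
s.name = 10

# to_frame().T turns the Series into a one-row DataFrame whose
# row label is the Series name; concat then stacks it under df.
result = pd.concat([df, s.to_frame().T])
print(result)
```

    The Series name still ends up as the row label, so the new row appears under index 10 while df itself is left unchanged.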
    
  • 2020-12-13 00:07

    There is another solution. The following code does not work (although I think pandas should support it):

    import pandas as pd
    
    # empty dataframe
    a = pd.DataFrame()
    a.loc[0] = {'first': 111, 'second': 222}
    

    But the next code runs fine:

    import pandas as pd
    
    # empty dataframe
    a = pd.DataFrame()
    a = a.append(pd.Series({'first': 111, 'second': 222}, name=0))
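
    A sketch of the same empty-frame case without DataFrame.append (which pandas 2.0 removed), assuming the new row is built as a one-row DataFrame whose index argument carries the label:

```python
import pandas as pd

# empty dataframe
a = pd.DataFrame()

# One-row frame; index=[0] is the label the new row should get.
row = pd.DataFrame([{'first': 111, 'second': 222}], index=[0])

# concat returns a new frame, so assign it back.
a = pd.concat([a, row])
print(a)
```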
    
  • 2020-12-13 00:14

    df.loc will do the job:

    >>> df = pd.DataFrame(np.random.randn(3, 2), columns=['A','B'])
    >>> df
              A         B
    0 -0.269036  0.534991
    1  0.069915 -1.173594
    2 -1.177792  0.018381
    >>> df.loc[13] = df.loc[1]
    >>> df
               A         B
    0  -0.269036  0.534991
    1   0.069915 -1.173594
    2  -1.177792  0.018381
    13  0.069915 -1.173594
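
    A small sketch of both behaviours of df.loc assignment, using made-up values rather than a copy of an existing row: a label that is absent adds a row, while a label that is already present is overwritten.

```python
import pandas as pd

df = pd.DataFrame({'A': [0.1, 0.2, 0.3], 'B': [1.0, 2.0, 3.0]})

# Label 13 does not exist yet, so this adds a new row.
df.loc[13] = [9.5, 8.5]

# Label 1 already exists, so this replaces its values in place.
df.loc[1] = [7.0, 6.0]

print(df)
```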
    
  • 2020-12-13 00:15

    I shall refer to the same sample of data as posted in the question:

    import numpy as np
    import pandas as pd
    df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
    print('The original data frame is: \n{}'.format(df))
    

    Running this code will give you

    The original data frame is:
    
              A         B         C         D
    0  0.494824 -0.328480  0.818117  0.100290
    1  0.239037  0.954912 -0.186825 -0.651935
    2 -1.818285 -0.158856  0.359811 -0.345560
    3 -0.070814 -0.394711  0.081697 -1.178845
    4 -1.638063  1.498027 -0.609325  0.882594
    5 -0.510217  0.500475  1.039466  0.187076
    6  1.116529  0.912380  0.869323  0.119459
    7 -1.046507  0.507299 -0.373432 -1.024795
    

    Now suppose you want to append a new row that is not a copy of any existing row. @Alon suggested an interesting approach: use df.loc to add a new row under a different index label. The catch is that if a row already exists at that label, its values are overwritten. This matters for datasets whose row index is not unique, such as store IDs in transaction data. A more general solution is to build the row as a pandas Series, set its name to the index label you want, and append it to the data frame. Don't forget to assign the result back to the original variable: df.append returns a new DataFrame and does not modify the original in place. The code follows:

    row = pd.Series({'A':10,'B':20,'C':30,'D':40},name=3)
    df = df.append(row)
    print('The new data frame is: \n{}'.format(df))
    

    The new output is:

    The new data frame is:
    
               A          B          C          D
    0   0.494824  -0.328480   0.818117   0.100290
    1   0.239037   0.954912  -0.186825  -0.651935
    2  -1.818285  -0.158856   0.359811  -0.345560
    3  -0.070814  -0.394711   0.081697  -1.178845
    4  -1.638063   1.498027  -0.609325   0.882594
    5  -0.510217   0.500475   1.039466   0.187076
    6   1.116529   0.912380   0.869323   0.119459
    7  -1.046507   0.507299  -0.373432  -1.024795
    3  10.000000  20.000000  30.000000  40.000000
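
    Two claims here can be checked with a short sketch: the result must be assigned back, and a duplicate label is kept rather than overwritten. It assumes pd.concat in place of df.append (removed in pandas 2.0). new_df and duplicate_rejected are illustrative names, and verify_integrity=True is shown as one way to refuse duplicate labels when they are not wanted:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
row = pd.Series({'A': 10, 'B': 20, 'C': 30, 'D': 40}, name=3)

# concat returns a new frame; df itself still has 8 rows.
new_df = pd.concat([df, row.to_frame().T])

# Label 3 now appears twice: the original row and the new one.
print(new_df.loc[3])

# With verify_integrity=True, pandas raises instead of
# silently creating a duplicate index label.
try:
    pd.concat([df, row.to_frame().T], verify_integrity=True)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```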
    