pandas .at versus .loc

情歌与酒 2020-11-27 15:20

I've been exploring how to optimize my code and ran across the pandas .at method. Per the documentation:

Fast label-based scalar ac

4 answers
  •  南方客 (original poster)
    2020-11-27 16:19

    As you asked about the limitations of .at, here is one thing I recently ran into (using pandas 0.22). Let's use the example from the documentation:

    import pandas as pd
    df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]], index=[4, 5, 6], columns=['A', 'B', 'C'])
    df2 = df.copy()  # untouched copy, used for the .loc comparison further down
    
        A   B   C
    4   0   2   3
    5   0   4   1
    6  10  20  30
    

    If I now do

    df.at[4, 'B'] = 100
    

    the result looks as expected

        A    B   C
    4   0  100   3
    5   0    4   1
    6  10   20  30
    

    However, when I try to do

    df.at[4, 'C'] = 10.05
    

    it seems that .at tries to preserve the column's dtype (here: int), so the value is truncated to 10:

        A    B   C
    4   0  100  10
    5   0    4   1
    6  10   20  30
    

    This appears to differ from .loc:

    df2.loc[4, 'C'] = 10.05
    

    yields the desired result:

        A   B      C
    4   0   2  10.05
    5   0   4   1.00
    6  10  20  30.00
    

    The risky part in the example above is that the conversion from float to int happens silently. Trying the same with a string that int() cannot parse raises an error:

    df.at[5, 'A'] = 'a_string'
    

    ValueError: invalid literal for int() with base 10: 'a_string'

    It will work, however, if the string is one that int() can parse, as noted by @n1k31t4 in the comments, e.g.

    df.at[5, 'A'] = '123'
    
         A   B   C
    4    0   2   3
    5  123   4   1
    6   10  20  30
    
