Understanding pandas dataframe indexing

太阳男子 2020-12-05 05:34

Summary: This doesn't work:

df[df.key==1]['D'] = 1

but this does:

df.D[df.key==1] = 1

Why?


2 Answers
  •  臣服心动
    2020-12-05 06:26

    The pandas documentation says:

    Returning a view versus a copy

    The rules about when a view on the data is returned are entirely dependent on NumPy. Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy. With single label / scalar indexing and slicing, e.g. df.ix[3:6] or df.ix[:, 'A'], a view will be returned.

    In df[df.key==1]['D'] = 1 you first do boolean indexing (which returns a copy of the DataFrame), then select the column 'D' from that copy, so the assignment lands on the copy and df itself is left unchanged.

    In df.D[df.key==1] = 1, you first select a column, then do boolean indexing on the resulting Series.

    This seems to make the difference, although I must admit that it is a little counterintuitive.
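
The first case can be checked directly. The following is a minimal sketch, assuming a small hypothetical DataFrame with the columns key and D from the question; the data values are made up for illustration. It shows that boolean indexing hands back a separate object and that the chained assignment leaves df untouched.

```python
import warnings

import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({"key": [1, 2, 1], "D": [0, 0, 0]})
mask = df.key == 1

# Boolean indexing builds a new DataFrame rather than a window onto df.
subset = df[mask]
print(subset is df)  # False

with warnings.catch_warnings():
    # pandas warns about this pattern; silence it for the demonstration.
    warnings.simplefilter("ignore")
    df[mask]["D"] = 1  # the write goes to the temporary copy

print(df["D"].tolist())  # [0, 0, 0]
```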

    Edit: The difference was identified by Dougal, see his comment: With version 1, the copy is made when the __getitem__ method is called for the boolean indexing. For version 2, only the __setitem__ method is called on the selected column, so no copy is returned and the values are simply assigned.
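
To make the __getitem__ / __setitem__ distinction concrete, here is a rough sketch that spells out version 1 as explicit method calls. The DataFrame is hypothetical, and the df.loc form at the end is not part of the answer above; it is added as an assumption-free way to perform the whole assignment as a single __setitem__ call with no intermediate object. Version 2 is left out on purpose, because whether it updates df depends on the pandas version (it does not under Copy-on-Write).

```python
import warnings

import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({"key": [1, 2, 1], "D": [0, 0, 0]})
mask = df.key == 1

# Version 1 spelled out: __getitem__ runs first and returns a new object,
# so the __setitem__ that follows never touches df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # older pandas warns about setting on a copy
    tmp = df.__getitem__(mask)       # same as df[mask]
    tmp.__setitem__("D", 1)          # same as tmp["D"] = 1
after_chained = df["D"].tolist()
print(after_chained)  # [0, 0, 0]

# Single-step form: one __setitem__ call on df.loc, applied to df itself.
df.loc[mask, "D"] = 1
after_loc = df["D"].tolist()
print(after_loc)  # [1, 0, 1]
```

Selecting rows and column in one indexing step avoids relying on whether an intermediate result happens to be a view or a copy.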
