What is the point of views in pandas if it is undefined whether an indexing operation returns a view or a copy?

误落风尘 2020-12-31 15:05

I have switched from R to pandas. I routinely get a SettingWithCopyWarning when I do something like:

df_a = pd.DataFram         


        
2 Answers
  •  抹茶落季
    2020-12-31 15:56

    Great question!

    The short answer is: this is a flaw in pandas that's being remedied.

    You can find a longer discussion of the nature of the problem here, but the main take-away is that we're now moving to a "copy-on-write" behavior in which any time you slice, you get a new copy, and you never have to think about views. The fix will soon come through this refactoring project. I actually tried to fix it directly (see here), but it just wasn't feasible in the current architecture.

    In truth, we'll keep views in the background -- they make pandas SUPER memory efficient and fast when they can be provided -- but we'll end up hiding them from users so, from the user perspective, if you slice, index, or cut a DataFrame, what you get back will effectively be a new copy.

    (This is accomplished by creating views when the user is only reading data, but whenever an assignment operation is used, the view will be converted to a copy before the assignment takes place.)
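To make the mechanism described above concrete, here is a minimal pure-Python sketch of copy-on-write. It is a toy, not pandas code: the class `CowSlice` and its attributes are hypothetical names invented for illustration, and it only handles writes made through the slice (a real implementation also has to protect the slice from later writes to the parent).

```python
class CowSlice:
    """Toy copy-on-write slice over a list (hypothetical, not pandas internals)."""

    def __init__(self, parent, start, stop):
        self._parent = parent
        # Positions in the parent that this slice covers (clamped like list slicing).
        self._index = range(len(parent))[start:stop]
        # Private copy of the data; created lazily on the first write.
        self._own = None

    @property
    def is_view(self):
        # True while the slice still reads straight from the parent.
        return self._own is None

    def __getitem__(self, i):
        if self._own is not None:
            return self._own[i]
        # Read-only access: no data is copied, we look into the parent.
        return self._parent[self._index[i]]

    def __setitem__(self, i, value):
        if self._own is None:
            # First write: convert the view into a copy before assigning.
            self._own = [self._parent[j] for j in self._index]
        self._own[i] = value


data = [10, 20, 30, 40]
part = CowSlice(data, 1, 3)

first_read = part[0]        # served from the parent, nothing copied yet
was_view = part.is_view

part[0] = 99                # triggers the copy, then assigns into it
```

After the assignment, `part` holds its own data and `data` is unchanged, which is the user-facing behavior the answer describes: slicing always looks like it produced a new copy, while reads stay cheap.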

    Best guess is that the fix will land within a year. In the meantime, I'm afraid some .copy() calls may be necessary, sorry!
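As a rough illustration of the `.copy()` workaround mentioned above: the question's own code is truncated, so the frame and column names below (`df`, `subset`, `a`, `b`) are made up for this sketch. The idea is to take an explicit copy of the selection before assigning to it, so pandas knows the result is independent of the original.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Take an explicit copy of the filtered rows. Without .copy(), the
# assignment below may raise SettingWithCopyWarning on pandas versions
# that do not use copy-on-write by default.
subset = df[df["a"] > 1].copy()

# Safe to modify: this changes subset only, never df.
subset["b"] = 0

original_b = df["b"].tolist()
subset_b = subset["b"].tolist()
```

Here `df` keeps its original values in column `b`, and only `subset` sees the zeros.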
