Why does subclassing a DataFrame mutate the original object?

Submitted by 爱⌒轻易说出口 on 2019-12-06 05:37:28

I'll add to the warnings. Not that I want to discourage you; I actually applaud your efforts.

However, this won't be the last of your questions as to what is going on.

That said, once you run:

super(SubFrame, self).__init__(*args, **kwargs)

self is a bona fide dataframe. You created it by passing another dataframe to the constructor.

Try this as an experiment:

import pandas as pd

d1 = pd.DataFrame(1, list('AB'), list('XY'))
d2 = pd.DataFrame(d1)

d2.index.name = 'IDX'

d1

     X  Y
IDX      
A    1  1
B    1  1

So the observed behavior is consistent: when you construct one dataframe by passing another dataframe to the constructor, the new dataframe ends up pointing to the same underlying objects as the original. Here, d2 and d1 share the same index, so naming it through d2 shows up in d1 as well.

To answer your question: subclassing isn't what allows the original object to be mutated. It's the way pandas constructs a dataframe from a passed dataframe.

Avoid this by instantiating with a copy:

d2 = pd.DataFrame(d1.copy())
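
As a quick check, here is a minimal sketch that repeats the experiment above with a copy. It reuses the d1 and d2 names from the answer and only inspects the index name afterwards.

```python
import pandas as pd

d1 = pd.DataFrame(1, list("AB"), list("XY"))

# Build the second frame from an explicit copy so it gets its own index object.
d2 = pd.DataFrame(d1.copy())
d2.index.name = "IDX"

# Compare the index names of the original and the new frame.
print(d1.index.name)
print(d2.index.name)
```

If the copy did its job, only d2 should carry the IDX name.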

What's going on in the __init__

You want to pass all the args and kwargs on to pd.DataFrame.__init__, except for the specific kwargs that are intended for your subclass: in this case, freq and ddof. pop is a convenient way to grab each value and delete its key from kwargs before passing the rest on to pd.DataFrame.__init__.
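
The pattern described above might look roughly like the sketch below. The class name SubFrame and the freq and ddof keywords come from the question. Storing them as attributes after the super call, the None defaults, and listing them in _metadata (the hook pandas documents for custom attributes on subclasses) are my assumptions about how the rest of the class would be written.

```python
import pandas as pd


class SubFrame(pd.DataFrame):
    # Assumption: register the extra attributes so pandas treats them as
    # plain attributes rather than as possible column names.
    _metadata = ["freq", "ddof"]

    def __init__(self, *args, **kwargs):
        # Pull out the subclass-only kwargs before DataFrame sees them;
        # pop returns the value and removes the key in one step.
        freq = kwargs.pop("freq", None)
        ddof = kwargs.pop("ddof", None)
        # Everything that is left goes to pd.DataFrame.__init__ unchanged.
        super().__init__(*args, **kwargs)
        self.freq = freq
        self.ddof = ddof


d1 = pd.DataFrame(1, list("AB"), list("XY"))
# Pass a copy so the new frame does not share objects with d1.
sf = SubFrame(d1.copy(), freq="M", ddof=1)
print(type(sf).__name__, sf.freq, sf.ddof)
```

Without the two pop calls, freq and ddof would be forwarded to pd.DataFrame.__init__, which does not accept them.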


How I'd implement this with pipe

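def add_freq(df, freq):
    # Work on a copy rather than on the dataframe that was passed in
    df = df.copy()
    # Convert the frequency string to an offset and attach it to the index
    df.index.freq = pd.tseries.frequencies.to_offset(freq)
    return df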

df = pd.DataFrame(dict(A=[1, 2]), pd.to_datetime(['2017-03-31', '2017-04-30']))

df.pipe(add_freq, 'M')