How to subclass pandas DataFrame?

[愿得一人] 2020-11-28 04:57

Subclassing pandas classes seems to be a common need, but I could not find any references on the subject. (It seems that pandas developers are still working on it: https://github.com/p

2 Answers
  •  南笙 (original poster)
    2020-11-28 05:45

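    For Requirement 1, just define _constructor so that operations which return a new frame, such as selecting columns, produce a MyDF instead of a plain DataFrame: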

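    import pandas as pd
    import numpy as np

    class MyDF(pd.DataFrame):
        # pandas uses _constructor to build the result of most operations,
        # so returning MyDF here keeps the subclass type.
        @property
        def _constructor(self):
            return MyDF


    mydf = MyDF(np.random.randn(3, 4), columns=['A', 'B', 'C', 'D'])
    print(type(mydf))

    # Selecting columns returns a MyDF rather than a plain DataFrame.
    mydf_sub = mydf[['A', 'C']]
    print(type(mydf_sub))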
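
    The same idea can be extended to single-column selection, which returns a Series. The sketch below is not part of the original answer: the MySeries class is a hypothetical companion, and it assumes a reasonably recent pandas in which _constructor_sliced and _constructor_expanddim are the hooks used for moving between a frame and a series.

```python
import numpy as np
import pandas as pd


class MySeries(pd.Series):
    # Hypothetical companion class, introduced only for this illustration.
    @property
    def _constructor(self):
        return MySeries

    @property
    def _constructor_expanddim(self):
        # Used when a series is expanded into a frame, e.g. to_frame().
        return MyDF


class MyDF(pd.DataFrame):
    @property
    def _constructor(self):
        return MyDF

    @property
    def _constructor_sliced(self):
        # Used when a single column or row is pulled out of the frame.
        return MySeries


mydf = MyDF(np.random.randn(3, 4), columns=["A", "B", "C", "D"])

sub = mydf[["A", "C"]]   # column subset, built via _constructor
head = mydf.head(2)      # row subset, built via _constructor
col = mydf["A"]          # single column, built via _constructor_sliced
print(type(sub).__name__, type(head).__name__, type(col).__name__)
```

    Without _constructor_sliced, mydf["A"] would come back as a plain pandas Series even though the frame-level operations keep the subclass.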
    

    I don't think there is a simple solution for Requirement 2. You need to define __init__ or copy, or do something in _constructor, for example:

    import pandas as pd
    import numpy as np

    class MyDF(pd.DataFrame):
        # Comma-separated names of the custom attributes to carry over.
        _attributes_ = "myattr1,myattr2"

        def __init__(self, *args, **kw):
            super().__init__(*args, **kw)
            # MyDF(other_mydf): copy the custom attributes from the source.
            if len(args) == 1 and isinstance(args[0], MyDF):
                args[0]._copy_attrs(self)

        def _copy_attrs(self, df):
            # Write to __dict__ directly to bypass DataFrame.__setattr__.
            for attr in self._attributes_.split(","):
                df.__dict__[attr] = getattr(self, attr, None)

        @property
        def _constructor(self):
            # Return a factory that builds a MyDF and then copies the
            # attributes of the frame the result is derived from.
            def f(*args, **kw):
                df = MyDF(*args, **kw)
                self._copy_attrs(df)
                return df
            return f

    mydf = MyDF(np.random.randn(3, 4), columns=['A', 'B', 'C', 'D'])
    print(type(mydf))

    mydf_sub = mydf[['A', 'C']]
    print(type(mydf_sub))

    mydf.myattr1 = 1
    mydf_cp1 = MyDF(mydf)   # goes through __init__
    mydf_cp2 = mydf.copy()  # goes through _constructor
    print(mydf_cp1.myattr1, mydf_cp2.myattr1)
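
    As an alternative to the factory function above, pandas also has a _metadata class attribute for subclasses. The following is a minimal sketch, assuming a reasonably recent pandas; it reuses the names MyDF, myattr1 and myattr2 from the answer. It relies on pandas copying the listed attributes when it finalizes a derived object, which happens for many operations (such as column selection and copy) but possibly not for all of them.

```python
import numpy as np
import pandas as pd


class MyDF(pd.DataFrame):
    # Attribute names listed here are carried over by pandas when a new
    # object is derived from an existing one and then finalized.
    _metadata = ["myattr1", "myattr2"]

    @property
    def _constructor(self):
        # A class rather than a function, so pandas code that expects
        # a class here keeps working.
        return MyDF


mydf = MyDF(np.random.randn(3, 4), columns=["A", "B", "C", "D"])
mydf.myattr1 = 1

mydf_sub = mydf[["A", "C"]]  # column selection
mydf_cp = mydf.copy()        # explicit copy
print(type(mydf_sub).__name__, mydf_sub.myattr1, mydf_cp.myattr1)
```

    Constructing MyDF(mydf) directly may not go through that finalize step, so the __init__ override shown earlier may still be needed if that case matters.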
    
