How to subclass pandas DataFrame?

[愿得一人] 2020-11-28 04:57

Subclassing pandas classes seems a common need but I could not find references on the subject. (It seems that pandas developers are still working on it: https://github.com/p

2 answers
  •  慢半拍i (original poster)
    2020-11-28 05:49

    There is now an official guide on how to subclass pandas data structures, covering both DataFrame and Series.

    The guide is available here: https://pandas.pydata.org/pandas-docs/stable/development/extending.html#extending-subclassing-pandas

    The guide points to the subclassed DataFrame in the GeoPandas project as a good example: https://github.com/geopandas/geopandas/blob/master/geopandas/geodataframe.py

    As in HYRY's answer, it seems there are two things you're trying to accomplish:

    1. When calling methods on an instance of your class, get back instances of the correct type (your type). To do this, define a _constructor property that returns your type.
    2. Add attributes that are carried over to copies of your object. To do this, list the names of these attributes in the special _metadata class attribute.

    Here's an example:

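    from pandas import DataFrame

    class SubclassedDataFrame(DataFrame):
        # Attribute names listed here are copied onto objects derived from this one (e.g. by copy())
        _metadata = ['added_property']
        added_property = 1  # This will be passed to copies

        @property
        def _constructor(self):
            # Make pandas methods return this subclass instead of a plain DataFrame
            return SubclassedDataFrame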
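
To check that both points behave as described, here is a minimal usage sketch. It repeats the class so that it runs on its own; the column names and the value 42 are made up for illustration, and which operations propagate _metadata may vary between pandas versions, so treat it as a quick check rather than a guarantee.

```python
import pandas as pd


class SubclassedDataFrame(pd.DataFrame):
    # Same class as in the example, repeated so this sketch is self-contained
    _metadata = ["added_property"]
    added_property = 1

    @property
    def _constructor(self):
        return SubclassedDataFrame


# Illustrative data; the column names are arbitrary
df = SubclassedDataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df.added_property = 42  # override the class-level default on this instance

head = df.head(2)   # a regular pandas method
copied = df.copy()  # copy() carries over the attributes named in _metadata

print(type(head).__name__)
print(type(copied).__name__, copied.added_property)
```

If _constructor were left out, df.head(2) would come back as a plain DataFrame, and without the _metadata entry the copy would fall back to the class-level default of 1.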
    
