Propagate pandas series metadata through joins

长发绾君心 2021-01-01 04:47

I'd like to be able to attach metadata to the series of dataframes (specifically, the original filename), so that after joining two dataframes I can see metadata on where each

1 answer
  • 2021-01-01 05:11

    I think something like this will work. If not, please file a bug report: this is supported but a bit bleeding edge, i.e. it IS possible that the join methods don't call this all the time. That part is a bit untested.

    See this issue for a more detailed example/bug fix.

    from pandas import DataFrame

    # Names listed in _metadata are treated as metadata attributes of the frame.
    DataFrame._metadata = ['name', 'filename']


    def __finalize__(self, other, method=None, **kwargs):
        """
        Propagate metadata from other to self.

        Parameters
        ----------
        other : the object from which to get the attributes that we are going
            to propagate
        method : optional, a passed method name; possibly to take different
            types of propagation actions based on this
        """
        # You need to arbitrate here when there are conflicts.
        for name in self._metadata:
            object.__setattr__(self, name, getattr(other, name, None))
        return self


    # Replace the default finalizer for DataFrame with the custom one.
    DataFrame.__finalize__ = __finalize__
    

    So this replaces the default finalizer for DataFrame with your custom one. Where I have indicated, you need to put some code which can arbitrate between conflicts. This is the reason it is not done by default: e.g. if frame1 has name 'foo' and frame2 has name 'bar', what do you do when the method is __add__, and what about another method? Let us know what you do and how it works out.

    This ONLY replaces the finalizer for DataFrame. You can simply do the default action if you want, which is to propagate other to self; you can also set nothing except under special cases of method.
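    To illustrate the "special cases of method" idea, here is a minimal sketch that copies metadata only for a chosen set of method names and resets it to None otherwise. The set `PROPAGATING_METHODS` is a hypothetical choice, and the method strings pandas actually passes differ between versions, so check them against your installation.

```python
import pandas as pd

pd.DataFrame._metadata = ["name", "filename"]

# Hypothetical whitelist: only these operations carry metadata over.
PROPAGATING_METHODS = {"copy", "take"}


def __finalize__(self, other, method=None, **kwargs):
    propagate = method in PROPAGATING_METHODS
    for name in self._metadata:
        # Copy from `other` for whitelisted methods, otherwise clear the value.
        value = getattr(other, name, None) if propagate else None
        object.__setattr__(self, name, value)
    return self


pd.DataFrame.__finalize__ = __finalize__

src = pd.DataFrame({"x": [1, 2]})
src.filename = "a.csv"

# Calling the finalizer directly shows the effect of the `method` argument.
kept = pd.DataFrame({"x": [1]}).__finalize__(src, method="copy")
dropped = pd.DataFrame({"x": [1]}).__finalize__(src, method="merge")
```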

    This method is meant to be overridden in sub-classes; that's why you are monkey patching here (rather than sub-classing, which is most of the time overkill).
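    For comparison, a sub-class version might look like the sketch below. `SourcedFrame` is a hypothetical name; it relies on the `_metadata` and `_constructor` hooks that pandas documents for sub-classing and keeps the default finalizer, so there is no conflict arbitration here.

```python
import pandas as pd


class SourcedFrame(pd.DataFrame):
    """DataFrame sub-class that carries a `filename` attribute (hypothetical)."""

    # Attributes listed here are copied by the default __finalize__.
    _metadata = ["filename"]

    @property
    def _constructor(self):
        # Make pandas operations return SourcedFrame instead of DataFrame.
        return SourcedFrame


df = SourcedFrame({"a": [1, 2], "b": [3, 4]})
df.filename = "a.csv"

subset = df[["a"]]
print(type(subset).__name__, subset.filename)
```

    This leaves plain DataFrame objects untouched, at the cost of having to construct every frame through the sub-class.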
