Pandas Groupby Agg Function Does Not Reduce

日久生厌 2020-12-01 18:14

I am using an aggregation function that I have used in my work for a long time now. The idea is that if the Series passed to the function is of length 1 (i.e. the group only

2 Answers
  • 2020-12-01 18:36

    This is a misfeature in pandas' groupby aggregation. If the aggregator returns a list (or a Series, Index, or NumPy array) for the first group, it fails with the error you mention; if it returns any other kind of value for the first group, it works fine, even when later groups return lists. The broken code is in groupby.py:

    def _aggregate_series_pure_python(self, obj, func):
    
        group_index, _, ngroups = self.group_info
    
        counts = np.zeros(ngroups, dtype=int)
        result = None
    
        splitter = get_splitter(obj, group_index, ngroups, axis=self.axis)
    
        for label, group in splitter:
            res = func(group)
            if result is None:
                if (isinstance(res, (Series, Index, np.ndarray)) or
                        isinstance(res, list)):
                    raise ValueError('Function does not reduce')
                result = np.empty(ngroups, dtype='O')
    
            counts[label] = group.shape[0]
            result[label] = res
    

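    Notice the combination of `if result is None` and `isinstance(res, list)`: the check runs only for the first group, and it raises when that first result is a list. Your options are: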

    1. Fake out groupby().agg(), so it doesn't see a list for the first group, or

    2. Do the aggregation yourself, using code like that above but without the erroneous test.
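A minimal sketch of option 2, assuming a single grouping column: it follows the same loop as the excerpt above (an object array filled one group at a time) but omits the "does not reduce" test. The helper names `make_list` and `aggregate_without_reduce_check`, the column name `smv`, and the sample rows are hypothetical and only serve as illustration.

```python
import numpy as np
import pandas as pd


def make_list(values):
    """Return the lone value for one-row groups, otherwise a list of all values."""
    items = list(values)
    return items if len(items) > 1 else items[0]


def aggregate_without_reduce_check(df, key, column, func):
    """Apply func to each group of df[column] and collect the results.

    Mirrors the pure-Python loop in the excerpt, minus the isinstance test,
    so func may return a list for any group, including the first one.
    """
    grouped = df.groupby(key)[column]
    labels = []
    result = np.empty(grouped.ngroups, dtype=object)
    for i, (label, group) in enumerate(grouped):
        labels.append(label)
        result[i] = func(group)  # an object array can hold a list as one element
    return pd.Series(result, index=pd.Index(labels, name=key), name=column)


# Hypothetical sample data, not the asker's actual frame.
df = pd.DataFrame({
    "line_code": [401101, 401101, 401102, 401103],
    "smv": [7.76, 25.564, 25.564, 9.55],
})

agg = aggregate_without_reduce_check(df, "line_code", "smv", make_list)
print(agg)
```

Because the loop never inspects the type of the first result, the order of the groups no longer matters. The cost is that the result is an object-dtype Series, so numeric operations on it will not be vectorized.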

  • 2020-12-01 18:52

    I can't really explain why, but in my experience lists inside a pandas.DataFrame don't work all that well.

    I usually use a tuple instead. This works:

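    def MakeList(x):
        # Build a tuple rather than a list, so agg() does not reject the result.
        T = tuple(x)
        if len(T) > 1:
            return T
        else:
            # Single-row group: return the bare value.
            return T[0]
    
    # DFGrouped is the grouped DataFrame from the question.
    DF_Agg = DFGrouped.agg({'s.m.v.' : MakeList})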
    
         date line_code           s.m.v.
    0  2013-04-02    401101   (7.76, 25.564)
    1  2013-04-02    401102           25.564
    2  2013-04-02    401103             9.55
    3  2013-04-02    401104             4.87
    4  2013-04-02    401105   (7.76, 25.564)
    5  2013-04-02    401106  (5.282, 25.564)
    6  2013-04-02    401107            5.282
    