Pandas Groupby Agg Function Does Not Reduce

日久生厌 2020-12-01 18:14

I am using an aggregation function that I have used in my work for a long time now. The idea is that if the Series passed to the function is of length 1 (i.e. the group only

2 Answers
  •  不知归路
    2020-12-01 18:36

    This is a misfeature in DataFrame. If the aggregator returns a list for the first group, it will fail with the error you mention; if it returns a non-list (non-Series) for the first group, it will work fine. The broken code is in groupby.py:

    def _aggregate_series_pure_python(self, obj, func):
    
        group_index, _, ngroups = self.group_info
    
        counts = np.zeros(ngroups, dtype=int)
        result = None
    
        splitter = get_splitter(obj, group_index, ngroups, axis=self.axis)
    
        for label, group in splitter:
            res = func(group)
            if result is None:
                if (isinstance(res, (Series, Index, np.ndarray)) or
                        isinstance(res, list)):
                    raise ValueError('Function does not reduce')
                result = np.empty(ngroups, dtype='O')
    
            counts[label] = group.shape[0]
            result[label] = res
    

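    Notice the two conditions: result is None (true only for the first group) and isinstance(res, list). Together they raise the error whenever the first group returns a list. Your options are: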

    1. Fake out groupby().agg(), so it doesn't see a list for the first group, or

    2. Do the aggregation yourself, using code like that above but without the erroneous test.
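
    As a rough illustration of option 2, the loop below iterates over the groups itself and stores each result in an object array, much like the excerpt above but with no reduce check. The names collapse and aggregate_without_reduce_check and the columns key and val are made up for this sketch; collapse is only a hypothetical stand-in for an aggregator that sometimes returns a list.

```python
import numpy as np
import pandas as pd


def collapse(values):
    """Hypothetical aggregator: a scalar for a length-1 group, else a list."""
    items = list(values)
    return items[0] if len(items) == 1 else items


def aggregate_without_reduce_check(frame, by, column, func):
    """Apply func to each group of frame[column] and keep whatever it returns."""
    keys, values = [], []
    for key, group in frame.groupby(by)[column]:
        keys.append(key)
        values.append(func(group))

    # Fill an object array element by element so that list results are
    # stored as single objects rather than being unpacked.
    result = np.empty(len(values), dtype=object)
    for i, value in enumerate(values):
        result[i] = value

    return pd.Series(result, index=pd.Index(keys, name=by), name=column)


df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})
out = aggregate_without_reduce_check(df, "key", "val", collapse)
print(out)
```

    Here group "a" returns a list even though it comes first, and the result is an object Series indexed by key. The loop runs in pure Python, so it may be slower than a built-in aggregation on large frames.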
