itertools.groupby() not grouping correctly

Submitted by 跟風遠走 on 2020-01-08 17:43:13

Question


I have this data:

self.data = [(1, 1, 5.0),
             (1, 2, 3.0),
             (1, 3, 4.0),
             (2, 1, 4.0),
             (2, 2, 2.0)]

When I run this code:

for mid, group in itertools.groupby(self.data, key=operator.itemgetter(0)):

for list(group) I get:

[(1, 1, 5.0),
 (1, 2, 3.0),
 (1, 3, 4.0)]

which is what I want.

But if I use 1 instead of 0

for mid, group in itertools.groupby(self.data, key=operator.itemgetter(1)):

to group by the second number in the tuples, I only get:

[(1, 1, 5.0)]

even though other tuples also have 1 at index 1 (the second position).


Answer 1:


itertools.groupby collects together contiguous items with the same key. If you want all items with the same key, you have to sort self.data first.

for mid, group in itertools.groupby(
        sorted(self.data, key=operator.itemgetter(1)), key=operator.itemgetter(1)):
    ...  # use mid and list(group) here
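To make the difference concrete, here is a small sketch that runs groupby on the question's data twice, once as-is and once sorted by the second element. The names data, by_second, unsorted_groups and sorted_groups are chosen for illustration, and a plain list stands in for self.data.

```python
import itertools
import operator

# The question's data, as a plain list instead of self.data.
data = [(1, 1, 5.0), (1, 2, 3.0), (1, 3, 4.0), (2, 1, 4.0), (2, 2, 2.0)]
by_second = operator.itemgetter(1)

# Unsorted: a new group starts every time the key changes,
# so the key 1 shows up in two separate groups.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(data, key=by_second)]

# Sorted by the same key: equal keys are now adjacent, one group per key.
sorted_groups = [
    (k, list(g))
    for k, g in itertools.groupby(sorted(data, key=by_second), key=by_second)
]

for k, items in sorted_groups:
    print(k, items)
```

On the unsorted data the keys come out as 1, 2, 3, 1, 2, which is why the first group for key 1 contains only (1, 1, 5.0).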



Answer 2:


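A variant without sorting, using a dictionary. It makes a single pass over the data, so it should perform better than sorting first. Keys must be hashable, and on Python 3.7+ the groups come out in first-seen order rather than sorted order.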

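from collections import defaultdict

def full_group_by(l, key=lambda x: x):
    # Map each key to the list of items that share it, in input order.
    d = defaultdict(list)
    for item in l:
        d[key(item)].append(item)
    return d.items()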
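A minimal usage sketch, applying this function to the question's data and grouping by the second element. The function is repeated so the snippet runs on its own; the names data and groups are illustrative.

```python
from collections import defaultdict
from operator import itemgetter

def full_group_by(l, key=lambda x: x):
    # Same function as above, repeated to keep the snippet self-contained.
    d = defaultdict(list)
    for item in l:
        d[key(item)].append(item)
    return d.items()

data = [(1, 1, 5.0), (1, 2, 3.0), (1, 3, 4.0), (2, 1, 4.0), (2, 2, 2.0)]

# No sorting needed: every item lands in the list for its key.
groups = dict(full_group_by(data, key=itemgetter(1)))
for k, items in groups.items():
    print(k, items)
```

Unlike itertools.groupby, this builds every group in memory before returning, which is usually fine for data that already fits in a list.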



Answer 3:


The function below "fixes" several annoyances with Python's itertools.groupby.

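def groupby2(l, key=lambda x: x, val=lambda x: x, agg=lambda x: x, sort=True):
    # Sort first (unless sort=False) so that equal keys become contiguous.
    if sort:
        l = sorted(l, key=key)
    # Lazily yield (key, agg(values)); val extracts the value from each item.
    return ((k, agg(val(x) for x in v))
            for k, v in itertools.groupby(l, key=key))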

Specifically,

  1. The input doesn't have to be sorted beforehand; it is sorted internally unless sort=False.
  2. The key function can be passed as a positional argument.
  3. The output is a clean generator of (key, grouped_values) tuples, where the values are selected by the third parameter, val.
  4. Aggregation functions such as sum or statistics.mean are easy to apply.

Example Usage

import itertools
from operator import itemgetter
from statistics import mean  # e.g. pass mean instead of sum as agg

t = [('a', 1), ('b', 2), ('a', 3)]
for k, v in groupby2(t, itemgetter(0), itemgetter(1), sum):
    print(k, v)

This prints:

a 4
b 2
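As a further sketch, the same helper can take statistics.mean as the aggregator, or skip the sort with sort=False, in which case it falls back to itertools.groupby's contiguous grouping. The function is repeated so the snippet is self-contained; averages and raw are illustrative names.

```python
import itertools
from operator import itemgetter
from statistics import mean

def groupby2(l, key=lambda x: x, val=lambda x: x, agg=lambda x: x, sort=True):
    # Same helper as above, repeated to keep the snippet self-contained.
    if sort:
        l = sorted(l, key=key)
    return ((k, agg(val(x) for x in v))
            for k, v in itertools.groupby(l, key=key))

t = [('a', 1), ('b', 2), ('a', 3)]

# Average of the second element per key.
averages = list(groupby2(t, itemgetter(0), itemgetter(1), mean))

# sort=False keeps input order, so 'a' appears as two separate groups.
raw = list(groupby2(t, itemgetter(0), itemgetter(1), list, sort=False))

print(averages)
print(raw)
```

Note that with the default agg each value is a lazy generator tied to the underlying groupby iterator, so it has to be consumed before moving on to the next group; passing list, sum, or mean as agg avoids that.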




Source: https://stackoverflow.com/questions/8116666/itertools-groupby-not-grouping-correctly
