How to turn an itertools “grouper” object into a list

Submitted by 一世执手 on 2019-11-27 15:10:56

The reason your first approach doesn't work is that the groups get "consumed" when you create that list with

list(groupby("cccccaaaaatttttsssssss"))

To quote from the groupby docs:

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible.

Let's break it down into stages.

from itertools import groupby

a = list(groupby("cccccaaaaatttttsssssss"))
print(a)
b = a[0][1]
print(b)
print('So far, so good')
print(list(b))
print('What?!')

output

[('c', <itertools._grouper object at 0xb715104c>), ('a', <itertools._grouper object at 0xb715108c>), ('t', <itertools._grouper object at 0xb71510cc>), ('s', <itertools._grouper object at 0xb715110c>)]
<itertools._grouper object at 0xb715104c>
So far, so good
[]
What?!

Our itertools._grouper object at 0xb715104c is empty because it shares its contents with the "parent" iterator returned by groupby, and those items are now gone because that first list call iterated over the parent.

It's really no different from what happens if you try to iterate twice over any iterator, e.g. a simple generator expression.

g = (c for c in 'python')
print(list(g))
print(list(g))

output

['p', 'y', 't', 'h', 'o', 'n']
[]
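
Given that, one way to get real lists out of the groups is to convert each group while groupby is still positioned on it, i.e. before the parent iterator is advanced. Here's a minimal sketch, assuming you want to keep every group's contents; the names text and groups are only for illustration.

```python
from itertools import groupby

text = "cccccaaaaatttttsssssss"

# Call list() on each group inside the loop over groupby, so the group
# is read before groupby moves on to the next key.
groups = [(key, list(group)) for key, group in groupby(text)]

for key, items in groups:
    print(key, len(items), items)
```

Each entry in groups is now an ordinary (key, list) tuple, so it can be indexed or iterated as often as you like, at the cost of holding all the items in memory.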

BTW, here's another way to get the length of a groupby group if you don't actually need its contents; it's a little cheaper (and uses less RAM) than building a list just to find its length.

from itertools import groupby

for k, g in groupby("cccccaaaaatttttsssssss"):
    print(k, sum(1 for _ in g))

output

c 5
a 5
t 5
s 7