python itertools round robin with no duplications

Question

I managed to modify the recipe for roundrobin in https://docs.python.org/3.1/library/itertools.html
to include a limit (stop when reaching X elements) - code below...

Now - what I really want is "stop when reaching X elements but with no element duplication".
Is it even possible? (because it is a generator...)

from itertools import cycle, islice

def roundrobin(limit, *iterables):
    "roundrobin(4, 'ABC', 'D', 'EF') --> A D E B"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    # Python 3: use the bound __next__ method (Python 2 spelled it .next)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
                limit -= 1
                if limit == 0:
                    return
        except StopIteration:
            # One iterable is exhausted; drop it from the cycle
            pending -= 1
            nexts = cycle(islice(nexts, pending))

calling it with:

candidates = [['111', '222', '333'],['444','222','555']] 
list(roundrobin(4, *candidates))

I would like to get:

['111', '444', '222', '333']

and not:

['111', '444', '222', '222']

like I'm getting with the current code

Answer 1:

Here is one possible implementation - I've added a set, named seen, inside the generator function to keep track of the elements we've already yielded. Note that this means that all elements in every iterable must be hashable (if they get reached), which is not a limitation of the base roundrobin.

from itertools import cycle, islice

def roundrobin_limited_nodupe(limit, *iterables):
    """A round-robin iterator with duplicates removed and a limit.

        >>> list(roundrobin_limited_nodupe(6, 'ABC', 'DB', 'EFG'))
        ['A', 'D', 'E', 'B', 'F', 'C']  # only six elements, only one 'B'

    Notes:
      - Recipe credited to George Sakkis

    """
    pending = len(iterables)
    seen = set()  # keep track of what we've seen
    # Python 3: use the bound __next__ method (Python 2 spelled it .next)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next in nexts:
                candidate = next()
                if candidate not in seen:  # only yield when it's new
                    seen.add(candidate)
                    yield candidate
                    limit -= 1
                    if limit == 0:
                        return
        except StopIteration:
            # One iterable is exhausted; drop it from the cycle
            pending -= 1
            nexts = cycle(islice(nexts, pending))
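
As an alternative, the same behaviour can probably be had by composing three small pieces instead of modifying the recipe: the unmodified roundrobin, a de-duplicating filter, and islice for the limit. This is only a sketch for Python 3; the names unique and roundrobin_unique_limited are made up for illustration and are not part of itertools, and elements are still assumed to be hashable.

```python
from itertools import cycle, islice


def roundrobin(*iterables):
    """roundrobin('ABC', 'D', 'EF') --> A D E B F C"""
    # Unmodified recipe (credited to George Sakkis), Python 3 spelling
    pending = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for nxt in nexts:
                yield nxt()
        except StopIteration:
            # One iterable is exhausted; drop it from the cycle
            pending -= 1
            nexts = cycle(islice(nexts, pending))


def unique(iterable):
    """Hypothetical helper: yield each element only the first time it appears."""
    seen = set()  # requires hashable elements
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


def roundrobin_unique_limited(limit, *iterables):
    """Hypothetical helper: at most `limit` distinct elements in round-robin order."""
    return islice(unique(roundrobin(*iterables)), limit)


candidates = [['111', '222', '333'], ['444', '222', '555']]
result = list(roundrobin_unique_limited(4, *candidates))
print(result)  # ['111', '444', '222', '333']
```

Because everything stays lazy, no more input is consumed than is needed to produce the requested number of distinct elements. One small difference from the version above: islice stops immediately when limit is 0, whereas the in-place counter only stops when it hits exactly 0 after a decrement.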

Source: https://stackoverflow.com/questions/31559525/python-itertools-round-robin-with-no-duplications

Tags

python

itertools