Zip iterators asserting for equal length in Python

不思量自难忘° 2020-12-15 18:19

I am looking for a nice way to zip several iterables raising an exception if the lengths of the iterables are not equal.

In the case where the iterables

5 Answers
  • 2020-12-15 18:47

    I can think of a simpler solution: use itertools.zip_longest() and raise an exception if the sentinel value used to pad out the shorter iterables appears in a produced tuple:

    from itertools import zip_longest
    
    def zip_equal(*iterables):
        sentinel = object()
        for combo in zip_longest(*iterables, fillvalue=sentinel):
            if sentinel in combo:
                raise ValueError('Iterables have different lengths')
            yield combo
    

    Unfortunately, we can't use zip() with yield from to avoid a Python-level loop with a test on each iteration. By the time the shortest iterator runs out, zip() has already advanced all the iterators that precede it, so it swallows the evidence if any of them holds just one extra item.
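    To make the point about swallowed evidence concrete, here is a minimal sketch. The names longer, shorter, pairs and leftover are chosen only for this illustration. The first argument has exactly one extra item, and zip() consumes it before it notices that the second argument is exhausted:

```python
# The first iterator has exactly one more item than the second.
longer = iter([1, 2, 3])
shorter = iter("ab")

# zip() pulls 3 from `longer`, then finds `shorter` exhausted and
# discards the 3 without yielding it.
pairs = list(zip(longer, shorter))

# Nothing is left to prove that `longer` ever had a third item.
leftover = list(longer)

print(pairs)     # [(1, 'a'), (2, 'b')]
print(leftover)  # []
```

    Because the extra item is gone, a check performed after zip() finishes cannot tell this case apart from two iterables of equal length.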

  • 2020-12-15 18:52

    Use more_itertools.zip_equal (v8.3.0+):

    Code

    import more_itertools as mit
    

    Demo

    list(mit.zip_equal(range(3), "abc"))
    # [(0, 'a'), (1, 'b'), (2, 'c')]
    
    list(mit.zip_equal(range(3), "abcd"))
    # UnequalIterablesError
    

    more_itertools is a third-party package, installed via pip install more_itertools

  • 2020-12-15 19:05

    For reference, I came up with a solution for two iterables that uses a sentinel iterable:

    import itertools
    
    class _SentinelException(Exception):
        def __iter__(self):
            raise _SentinelException
    
    
    def zip_equal(iterable1, iterable2):
        i1 = iter(itertools.chain(iterable1, _SentinelException()))
        i2 = iter(iterable2)
        try:
            while True:
                yield (next(i1), next(i2))
        except _SentinelException:  # i1 reaches end
            try:
                next(i2)  # check whether i2 reaches end
            except StopIteration:
                pass
            else:
                raise ValueError('the second iterable is longer than the first one')
        except StopIteration:  # i2 ran out after next(i1) had already succeeded, so i1 is longer than i2
            raise ValueError('the first iterable is longer than the second one')
    
  • 2020-12-15 19:06

    Here is an approach that doesn't require doing any extra checks with each loop of the iteration. This could be desirable especially for long iterables.

    The idea is to pad each iterable with a "value" at the end that raises an exception when reached, and then do the needed verification only at the very end. The approach uses zip() and itertools.chain().

    The code below was written for Python 3.5.

    import itertools
    
    class ExhaustedError(Exception):
        def __init__(self, index):
            """The index is the 0-based index of the exhausted iterable."""
            self.index = index
    
    def raising_iter(i):
        """Return an iterator that raises an ExhaustedError."""
        raise ExhaustedError(i)
        yield
    
    def terminate_iter(i, iterable):
        """Return an iterator that raises an ExhaustedError at the end."""
        return itertools.chain(iterable, raising_iter(i))
    
    def zip_equal(*iterables):
        iterators = [terminate_iter(*args) for args in enumerate(iterables)]
        try:
            yield from zip(*iterators)
        except ExhaustedError as exc:
            index = exc.index
            if index > 0:
                raise RuntimeError('iterable {} exhausted first'.format(index)) from None
            # Check that all other iterators are also exhausted.
            for i, iterator in enumerate(iterators[1:], start=1):
                try:
                    next(iterator)
                except ExhaustedError:
                    pass
                else:
                    raise RuntimeError('iterable {} is longer'.format(i)) from None
    

    Here is how it looks in use.

    >>> list(zip_equal([1, 2], [3, 4], [5, 6]))
    [(1, 3, 5), (2, 4, 6)]
    
    >>> list(zip_equal([1, 2], [3], [4]))
    RuntimeError: iterable 1 exhausted first
    
    >>> list(zip_equal([1], [2, 3], [4]))
    RuntimeError: iterable 1 is longer
    
    >>> list(zip_equal([1], [2], [3, 4]))
    RuntimeError: iterable 2 is longer
    
  • 2020-12-15 19:11

    PEP 618 introduces an optional boolean keyword argument, strict, for the built-in zip function.

    Quoting What’s New In Python 3.10:

    The zip() function now has an optional strict flag, used to require that all the iterables have an equal length.

    When enabled, a ValueError is raised if one of the arguments is exhausted before the others.
