Pythonic way to determine whether not null list entries are 'continuous'

渐次进展 2021-02-01 01:35

I'm looking for a way to easily determine if all not None items in a list occur in a single continuous slice. I'll use integers as examples of not None items.

11 Answers
  •  无人共我
    2021-02-01 02:07

    I did some profiling to compare @gnibbler's approach with the groupby approach. @gnibbler's approach is consistently faster, especially for longer lists. For example, I see about a 50% performance gain on random inputs of length 3-100, where each input has a 50% chance of containing a single run of ints (at a randomly selected position) and otherwise has values scattered at random. The test code is below. I interleaved the two methods, randomly selecting which one goes first, to make sure any caching effects cancel out. Based on this, I'd say that while the groupby approach is more intuitive, @gnibbler's approach may be appropriate if profiling indicates that this is an important part of the overall code to optimize. In that case, appropriate comments should be used to explain how all/any are being used to consume values from the iterator.
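
To illustrate the iterator-consuming behaviour mentioned above, here is a minimal sketch (not part of the benchmark). The helper name `remaining_after_steps` is hypothetical and only exists for this illustration. It shows what is left in the shared iterator after each short-circuiting call.

```python
def remaining_after_steps(seq, steps):
    """Run the first `steps` consuming calls (all, then any) and return the leftover items."""
    it = iter(seq)
    for call in [all, any][:steps]:
        # Each call stops at the first item that decides its result,
        # and that deciding item is consumed from the shared iterator too.
        call(x is None for x in it)
    return list(it)

data = [None, None, 1, 2, None, None, 3]

print(remaining_after_steps(data, 0))  # nothing consumed yet
print(remaining_after_steps(data, 1))  # all() ate the leading Nones and the first value
print(remaining_after_steps(data, 2))  # any() ate the rest of the run and the None ending it
```

Whatever is left after the second step has to be all None for the list to count as contiguous, which is what the final `all(...)` in the function checks.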

    from itertools import groupby
    import random, time
    
    def contiguous1(seq):
        # gnibbler's approach: each call consumes items from the same iterator
        seq = iter(seq)
        all(x is None for x in seq)        # Burn through any Nones at the beginning
        any(x is None for x in seq)        # and the first group
        return all(x is None for x in seq) # everything else (if any) should be None.
    
    def contiguous2(seq):
        return sum(1 for k,g in groupby(seq, lambda x: x is not None) if k) == 1
    
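    times = {'contiguous1': 0.0, 'contiguous2': 0.0}
    
    for _ in range(400000):
        n = random.randint(3, 100)
        items = [None] * n
        if random.randint(0, 1):
            # One contiguous run of values at a random position.
            # The end index is drawn from s+1..n so the run is never empty.
            s = random.randint(0, n - 1)
            e = random.randint(s + 1, n)
            for j in range(s, e):
                items[j] = 3
        else:
            # Otherwise each slot gets a value with probability 1/3
            for j in range(n):
                if not random.randint(0, 2):
                    items[j] = 3
        # Randomly pick which function runs first to cancel out caching effects
        if random.randint(0, 1):
            funcs = [contiguous1, contiguous2]
        else:
            funcs = [contiguous2, contiguous1]
        for func in funcs:
            t0 = time.perf_counter()
            func(items)
            times[func.__name__] += time.perf_counter() - t0
    
    print()
    for f, t in times.items():
        print('%10.7f %s' % (t, f))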
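
Before trusting the timings, it is worth checking that the two functions agree. The sketch below is a self-contained sanity check; the names `contiguous_consume` and `contiguous_groupby` are placeholders for the two approaches compared above, and the sample lists are made up for illustration.

```python
from itertools import groupby

def contiguous_consume(seq):
    """all/any approach: consume the iterator in three passes."""
    it = iter(seq)
    all(x is None for x in it)         # leading Nones (and the first value)
    any(x is None for x in it)         # rest of the first run (and the None ending it)
    return all(x is None for x in it)  # anything left must be None

def contiguous_groupby(seq):
    """groupby approach: exactly one run of non-None items."""
    runs = groupby(seq, lambda x: x is not None)
    return sum(1 for is_value, _ in runs if is_value) == 1

samples = [
    [None, None, 1, 2, 3, None],  # one run
    [1, 2, 3],                    # one run, no Nones
    [1, None, 2],                 # two runs
    [None, 1, None, None, 2],     # two runs
]
for s in samples:
    print(s, contiguous_consume(s), contiguous_groupby(s))

# Edge case: with no values at all, the two approaches disagree.
for s in ([], [None, None]):
    print(s, contiguous_consume(s), contiguous_groupby(s))
```

The two agree whenever the list contains at least one non-None item. For an empty or all-None list, the all/any version returns True while the groupby version returns False, so which one is "right" depends on how you want to treat that case.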
    
