Create a complement of list preserving duplicate values

爱一瞬间的悲伤 2020-12-11 01:32

Given a list a = [1, 2, 2, 3] and its sublist b = [1, 2], find a list that complements b such that sorted(a) == sorted(b + complement).
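
The requirement can be written down as a small checker. This is only a sketch; the helper name `is_complement` is made up here and is not part of the question.

```python
def is_complement(a, b, complement):
    """Return True if b and complement together contain exactly the items of a."""
    # Sorting both sides compares the lists as multisets, so duplicates count.
    return sorted(a) == sorted(b + complement)


a = [1, 2, 2, 3]
b = [1, 2]

print(is_complement(a, b, [2, 3]))  # True
print(is_complement(a, b, [3]))     # False: one 2 is missing
```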

5 answers
  • 2020-12-11 01:36

    To reduce the complexity of your already valid approach, you could use collections.Counter (a specialized dictionary with fast lookup) to count the items in both lists.

    Then subtract the counts, keep only the items whose count is > 0, and rebuild the list by chaining the repeated items with itertools.chain:

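    from collections import Counter
    import itertools
    
    a = [1, 2, 2, 2, 3]
    b = [1, 2]
    
    # remaining maps each value k to the number x of copies left after removing b;
    # each k is repeated x times and the pieces are chained into one list
    remaining = Counter(a) - Counter(b)
    print(list(itertools.chain.from_iterable(x*[k] for k,x in remaining.items() if x > 0)))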
    

    result:

    [2, 2, 3]
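
As a side note, `Counter` subtraction already drops every count that is zero or negative, so the `x > 0` filter acts as a safety net rather than a requirement. It also means that values of b missing from a disappear silently instead of raising an error. A minimal sketch of that behaviour (the name `diff` and the extra value 4 are only for illustration):

```python
from collections import Counter

a = [1, 2, 2, 2, 3]
b = [1, 2, 4]  # 4 does not occur in a at all

diff = Counter(a) - Counter(b)

# Subtraction keeps only strictly positive counts:
# 1 drops out (1 - 1 = 0) and 4 drops out (0 - 1 = -1).
print(diff)                     # Counter({2: 2, 3: 1})
print(sorted(diff.elements()))  # [2, 2, 3]
```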
    
  • 2020-12-11 01:36

    An O(n log n) approach: sort both lists, then walk through them in parallel with two indices.

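    a = [1, 2, 2, 3]
    b = [1, 2]
    a.sort()
    b.sort()
    
    L = []  # the complement, built in sorted order
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            # a[i] has no partner left in b, so it belongs to the complement
            L.append(a[i])
            i += 1
        elif a[i] > b[j]:
            # b[j] has no partner in a, so b is not a sublist of a
            raise ValueError('complement cannot be constructed')
        else:
            # matched pair: consume one element from each list
            i += 1
            j += 1
    
    # whatever remains in a is unmatched and goes to the complement
    while i < len(a):
        L.append(a[i])
        i += 1
    
    # leftover elements in b mean that b is not a sublist of a
    if j < len(b):
        raise ValueError('complement cannot be constructed')
    
    print(L)  # [2, 3]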
    
  • 2020-12-11 01:39

    If the order of elements in the complement doesn't matter, then collections.Counter is all that is needed:

    from collections import Counter
    
    a = [1, 2, 3, 2]
    b = [1, 2]
    
    complement = list((Counter(a) - Counter(b)).elements())  # complement = [2, 3]
    

    If the complement should keep its items in the same order as in the original list, then use something like this:

    from collections import Counter, defaultdict
    from itertools import count
    
    a = [1,2,3,2]
    b = [2,1]
    
    c = Counter(b)
    d = defaultdict(count)
    
    complement = [x for x in a if next(d[x]) >= c[x]]  # complement = [3, 2]
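
The comprehension is compact, so it may help to spell out what it does: `d[x]` is a separate `itertools.count()` per value, and `next(d[x])` returns how many times `x` has been seen before the current position. An element is kept once at least `c[x]` copies of it have already been skipped. A roughly equivalent explicit loop, as a sketch (the names `ordered_complement` and `seen` are chosen here, not taken from the answer):

```python
from collections import Counter


def ordered_complement(a, b):
    """Drop the first c[x] occurrences of each x from a, keeping a's order."""
    c = Counter(b)      # how many copies of each value must be skipped
    seen = Counter()    # how many copies of each value have been visited so far
    result = []
    for x in a:
        if seen[x] >= c[x]:
            result.append(x)   # all copies owed to b were already skipped
        seen[x] += 1
    return result


print(ordered_complement([1, 2, 3, 2], [2, 1]))  # [3, 2]
```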
    
  • 2020-12-11 01:51

    A more declarative, and thus more Pythonic, approach that also improves performance for large b (and a) is to use a counter that supports a decrement operation:

    from collections import Counter
    
    class DecrementCounter(Counter):
    
        def decrement(self,x):
            if self[x]:
                self[x] -= 1
                return True
            return False
    

    Now we can use list comprehension:

    b_count = DecrementCounter(b)
    complement = [x for x in a if not b_count.decrement(x)]
    

    Here we keep track of the counts in b. For each element of a, we check whether b_count still holds a copy of it. If it does, we decrement the counter and skip the element; otherwise we add it to the complement. Note that this only works if we are sure such a complement exists.

    After constructing the complement, you can check whether it is valid with:

    not bool(+b_count)
    

    If this is False, some elements of b were never matched, so no such complement can be constructed (for instance a=[1] and b=[1,3]). A full implementation could be:

    b_count = DecrementCounter(b)
    complement = [x for x in a if not b_count.decrement(x)]
    if +b_count:
        raise ValueError('complement cannot be constructed')
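
Put together as a self-contained sketch, with the class repeated so the snippet runs on its own. The wrapper name `complement_of` is made up for this example:

```python
from collections import Counter


class DecrementCounter(Counter):

    def decrement(self, x):
        # Consume one copy of x if any is left; report whether that succeeded.
        if self[x]:
            self[x] -= 1
            return True
        return False


def complement_of(a, b):
    b_count = DecrementCounter(b)
    complement = [x for x in a if not b_count.decrement(x)]
    if +b_count:  # unary plus keeps only positive counts: leftovers from b
        raise ValueError('complement cannot be constructed')
    return complement


print(complement_of([1, 2, 2, 3], [1, 2]))  # [2, 3]

try:
    complement_of([1], [1, 3])
except ValueError as exc:
    print(exc)  # complement cannot be constructed
```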
    

    If dictionary lookup runs in O(1) (which it usually does; only on rare occasions is it O(n)), then this algorithm runs in O(|a|+|b|), i.e. the sum of the sizes of the lists, whereas the remove approach will usually run in O(|a|×|b|).
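
For comparison, here is what a `list.remove`-based version might look like. The remove-based code itself is not reproduced in this thread, so this is an assumed reconstruction, and the function name `complement_by_remove` is made up. Each `remove` call scans the list from the front, which is where the O(|a|×|b|) behaviour comes from.

```python
def complement_by_remove(a, b):
    """Quadratic baseline: delete one copy of each item of b from a copy of a."""
    complement = list(a)          # work on a copy so a is left untouched
    for x in b:
        # list.remove deletes the first occurrence and scans linearly to find it;
        # it raises ValueError if x is not present.
        complement.remove(x)
    return complement


print(complement_by_remove([1, 2, 2, 3], [1, 2]))  # [2, 3]
```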

  • 2020-12-11 01:56

    Main idea: if the values are not unique, make them unique by pairing each value with its occurrence index.

    def add_duplicate_position(items):
        element_counter = {}
        for item in items:
            element_counter[item] = element_counter.setdefault(item,-1) + 1
            yield element_counter[item], item
        
    assert list(add_duplicate_position([1, 2, 2, 3])) == [(0, 1), (0, 2), (1, 2), (0, 3)]
    
    def create_complementary_list_with_duplicates(a,b):
        a = list(add_duplicate_position(a))
        b = set(add_duplicate_position(b))
        return [item for position, item in a if (position, item) not in b]
      
    a = [1, 2, 2, 3]
    b = [1, 2]
    assert create_complementary_list_with_duplicates(a,b) == [2, 3]
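
Because each element is tagged with its occurrence index, this approach drops the earliest occurrences of a value and keeps the later ones in their original order. It also requires the items to be hashable, since the tagged pairs go into a set. A condensed restatement of the same idea to show the ordering, as a sketch (the names `tag_occurrences` and `complement_keep_order` are made up here):

```python
def tag_occurrences(items):
    """Yield (occurrence_index, item), e.g. the second 2 becomes (1, 2)."""
    seen = {}
    for item in items:
        index = seen.get(item, 0)
        seen[item] = index + 1
        yield index, item


def complement_keep_order(a, b):
    tagged_b = set(tag_occurrences(b))  # tagged pairs are unique, so a set works
    return [item for index, item in tag_occurrences(a)
            if (index, item) not in tagged_b]


# The first two 2s are matched against b; the third 2 stays in place.
print(complement_keep_order([2, 1, 2, 3, 2], [2, 2]))  # [1, 3, 2]
```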
    