Identifying lists that have 3 elements in common in a list of lists

有刺的猬 2021-01-16 09:00

I have a list of lists. If there are sublists that have the first three elements in common, merge them into one list and add up all the fourth elements.

The problem i

3 Answers
  •  忘掉有多难
    2021-01-16 09:35

    You can use the same principle, by using the first three elements as a key, and using int as the default value factory for the defaultdict (so you get 0 as the initial value):

    from collections import defaultdict
    
    a_list = [['apple', 50, 60, 7],
              ['orange', 70, 50, 8],
              ['apple', 50, 60, 12]]
    
    d = defaultdict(int)
    for sub_list in a_list:
        key = tuple(sub_list[:3])
        d[key] += sub_list[-1]
    
    # Python 2 only: dict.iteritems() does not exist in Python 3
    new_data = [list(k) + [v] for k, v in d.iteritems()]
    

    If you are using Python 3, you can simplify this to:

    d = defaultdict(int)
    for *key, v in a_list:
        d[tuple(key)] += v
    
    new_data = [list(k) + [v] for k, v in d.items()]
    

    This works because a starred target collects all 'remaining' values from a list, so the first three values of each sublist are assigned to key (as a list) and the last value is assigned to v, which makes the loop a little simpler. Python 3 dicts have no .iteritems() method; there, .items() returns a lightweight view of the dict instead of building a list, so it serves the same purpose.
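
    As a small illustration of the starred target on its own (Python 3 only), here is a minimal sketch; the variable name row is made up for this example and is not part of the answer's code:

```python
# Hypothetical single row, same shape as the sublists in a_list
row = ['apple', 50, 60, 7]

# The starred name collects everything except the last element, as a list
*key, v = row

print(key)         # ['apple', 50, 60]
print(v)           # 7
print(tuple(key))  # ('apple', 50, 60), hashable, so usable as a dict key
```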

    So, we use a defaultdict that uses 0 as the default value, then for each key generated from the first 3 values (as a tuple, so it can be used as a dictionary key) we add the last value to the running total.

    • So for the first item ['apple', 50, 60, 7] we create a key ('apple', 50, 60), look that up in d (where it doesn't exist, but defaultdict will then use int() to create a new value of 0), and add the 7 from that first item.

    • Do the same for the ('orange', 70, 50) key and value 8.

    • For the 3rd item we get the ('apple', 50, 60) key again and add 12 to the pre-existing 7 in d[('apple', 50, 60)], for a total of 19.

    Then we turn the (key, value) pairs back into lists and you are done. This results in:

    >>> new_data
    [['apple', 50, 60, 19], ['orange', 70, 50, 8]]
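
    Putting the steps above together, the Python 3 variant could be wrapped in a function roughly like this. This is only a sketch: the names merge_rows, totals and value are invented here, and it assumes every sublist ends in a number.

```python
from collections import defaultdict


def merge_rows(rows):
    """Sum the last element of all rows that share the preceding elements."""
    totals = defaultdict(int)  # missing keys start at int() == 0
    for *key, value in rows:
        # Lists are unhashable, so the key is converted to a tuple
        totals[tuple(key)] += value
    # Dicts keep insertion order on Python 3.7+, so keys appear in first-seen order
    return [list(key) + [total] for key, total in totals.items()]


a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

print(merge_rows(a_list))
# [['apple', 50, 60, 19], ['orange', 70, 50, 8]]
```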
    

    An alternative implementation that requires sorting the data uses itertools.groupby:

    from itertools import groupby
    from operator import itemgetter
    
    a_list = [['apple', 50, 60, 7],
              ['orange', 70, 50, 8],
              ['apple', 50, 60, 12]]
    
    newlist = [list(key) + [sum(i[-1] for i in sublists)] 
        for key, sublists in groupby(sorted(a_list), key=itemgetter(0, 1, 2))]
    

    This produces the same output. It is going to be slower if your data isn't already sorted, because of the extra sorting step, but it's good to know of different approaches.
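
    The sort is needed because groupby only merges consecutive items with equal keys. The sketch below shows what happens without it, and a variant that sorts on the same key it groups by, so the fourth element never takes part in the comparison. The names first_three and merge_rows_sorted are made up for this example.

```python
from itertools import groupby
from operator import itemgetter

first_three = itemgetter(0, 1, 2)  # returns a tuple of the first three elements

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

# Without sorting, the two 'apple' rows are not adjacent,
# so groupby yields them as two separate groups.
unsorted_keys = [key for key, _ in groupby(a_list, key=first_three)]
print(len(unsorted_keys))  # 3


def merge_rows_sorted(rows):
    """Sort by the grouping key, then sum the last element within each group."""
    ordered = sorted(rows, key=first_three)
    return [list(key) + [sum(row[-1] for row in group)]
            for key, group in groupby(ordered, key=first_three)]


print(merge_rows_sorted(a_list))
# [['apple', 50, 60, 19], ['orange', 70, 50, 8]]
```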
