Identifying lists that have 3 elements in common in a list of lists

有刺的猬 2021-01-16 09:00

I have a list of lists. If there are sublists that have the first three elements in common, merge them into one list and sum all the fourth elements.

The problem i

3 answers

忘掉有多难 (OP)

2021-01-16 09:35
You can use the same principle: use the first three elements as a key, and use int as the default factory for the defaultdict (so you get 0 as the initial value):
```
from collections import defaultdict

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

d = defaultdict(int)
for sub_list in a_list:
    key = tuple(sub_list[:3])
    d[key] += sub_list[-1]

# Python 2 only: .iteritems() does not exist on Python 3 dicts
new_data = [list(k) + [v] for k, v in d.iteritems()]
```
If you are using Python 3, you can simplify this to:
```
d = defaultdict(int)
for *key, v in a_list:
    d[tuple(key)] += v

new_data = [list(k) + [v] for k, v in d.items()]
```
because you can use a starred target to take all 'remaining' values from a list: the first three elements of each sublist are collected into key (as a list) and the last value is assigned to v, making the loop just that little simpler. There is also no .iteritems() method on a dict in Python 3; .items() returns a lightweight view object there rather than building a list.
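To make the starred target concrete, here is a minimal sketch of the unpacking on a single sublist. The variable name `row` is only for illustration and is not part of the answer's code.

```python
# Starred assignment (Python 3): everything except the last element
# is collected into `key` as a list; the last element goes to `v`.
row = ['apple', 50, 60, 7]
*key, v = row

print(key)  # ['apple', 50, 60]
print(v)    # 7

# A list is not hashable, so it has to be converted to a tuple
# before it can serve as a dictionary key.
key_tuple = tuple(key)
print(key_tuple)  # ('apple', 50, 60)
```

This is why the loop body calls `tuple(key)` before indexing into the defaultdict.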

So, we use a defaultdict that uses 0 as the default value, then for each key generated from the first 3 values (as a tuple, so it can be used as a dictionary key) we add up the last values.
- For the first item, ['apple', 50, 60, 7], we create the key ('apple', 50, 60) and look it up in d. It doesn't exist yet, so defaultdict calls int() to create a new value of 0, and we add the 7 from that first item.
- We do the same for the ('orange', 70, 50) key and value 8.
- For the 3rd item we get the ('apple', 50, 60) key again and add 12 to the pre-existing 7 in d[('apple', 50, 60)], for a total of 19.
Then we turn the (key, value) pairs back into lists and you are done. This results in:
```
>>> new_data
[['apple', 50, 60, 19], ['orange', 70, 50, 8]]
```
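The steps walked through above can also be packaged as a small reusable function. This is only a sketch: the name `merge_by_prefix` and the `n_key` parameter are hypothetical and do not appear in the original answer, and the output order relies on dicts preserving insertion order (guaranteed from Python 3.7).

```python
from collections import defaultdict

def merge_by_prefix(rows, n_key=3):
    """Sum the last element of every row sharing its first n_key elements."""
    totals = defaultdict(int)        # a missing key starts at int() == 0
    for row in rows:
        key = tuple(row[:n_key])     # tuples are hashable, lists are not
        totals[key] += row[-1]
    # Turn each (key, total) pair back into a flat list.
    return [list(key) + [total] for key, total in totals.items()]

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

merged = merge_by_prefix(a_list)
print(merged)
# [['apple', 50, 60, 19], ['orange', 70, 50, 8]]
```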
An alternative implementation that requires sorting the data uses itertools.groupby:
```
from itertools import groupby
from operator import itemgetter

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

newlist = [list(key) + [sum(i[-1] for i in sublists)] 
    for key, sublists in groupby(sorted(a_list), key=itemgetter(0, 1, 2))]
```
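This produces the same output. Sorting is required because groupby only groups consecutive items with equal keys. The approach is going to be slower if your data isn't already sorted, since the sort costs O(n log n) while the dictionary version is a single pass, but it's good to know of different approaches.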
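One possible variation, not part of the original answer: sort on the same three-element key that groupby uses, rather than on the whole sublists. This avoids comparing the fourth elements during the sort. The name `key_of` is an assumption introduced for this sketch.

```python
from itertools import groupby
from operator import itemgetter

a_list = [['apple', 50, 60, 7],
          ['orange', 70, 50, 8],
          ['apple', 50, 60, 12]]

# Hypothetical helper name: extracts the first three elements as a tuple.
key_of = itemgetter(0, 1, 2)

# Sort and group with the same key so equal keys end up adjacent.
newlist = [list(key) + [sum(row[-1] for row in group)]
           for key, group in groupby(sorted(a_list, key=key_of), key=key_of)]

print(newlist)
# [['apple', 50, 60, 19], ['orange', 70, 50, 8]]
```

Note that the result comes back in sorted key order, whereas the dictionary version keeps the order in which each key first appeared.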