How to remove duplicates from a list of tuples when order is important

Submitted by 主宰稳场 on 2019-12-14 03:53:15

Question


I have seen some similar answers, but I can't find something specific for this case.

I have a list of tuples:

[(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]

I want to remove a tuple from this list only when its first element has already occurred earlier in the list, and the tuple that remains should be the one with the smallest second element.

So the output should look like this:

[(5, 0), (3, 1), (6, 4)]


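Answer 1:


Here's a linear-time approach that requires two iterations over your original list: the first pass records the smallest second element for each first element, and the second pass builds the result in the original order.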

t = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)] # test case 1
#t = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)] # test case 2
smallest = {}
inf = float('inf')

for first, second in t:
    if smallest.get(first, inf) > second:
        smallest[first] = second

result = []
seen = set()

for first, second in t:
    if first not in seen and second == smallest[first]:
        seen.add(first)
        result.append((first, second))

print(result) # [(5, 0), (3, 1), (6, 4)] for test case 1
              # [(3, 1), (5, 0), (6, 4)] for test case 2
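
For reuse, the two passes above can be wrapped in a function. This is only a minimal sketch: the name dedupe_keep_smallest is made up for illustration and is not part of the original answer, and it assumes the first elements are hashable.

```python
def dedupe_keep_smallest(pairs):
    """Keep, for each first element, the tuple with the smallest second element.

    The kept tuples appear in the same relative order as in `pairs`.
    (Hypothetical helper name, used for illustration only.)
    """
    # Pass 1: record the smallest second element for each first element.
    smallest = {}
    for first, second in pairs:
        if first not in smallest or second < smallest[first]:
            smallest[first] = second

    # Pass 2: emit the first tuple that matches the recorded minimum.
    result = []
    seen = set()
    for first, second in pairs:
        if first not in seen and second == smallest[first]:
            seen.add(first)
            result.append((first, second))
    return result


print(dedupe_keep_smallest([(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]))
print(dedupe_keep_smallest([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
```

Checking membership with "first not in smallest" instead of comparing against float('inf') also lets the sketch handle second elements that are not numbers, as long as they can be compared with each other.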



Answer 2:


Here is a compact version I came up with. It uses an OrderedDict and skips the replacement if the new value is larger than the old one.

from collections import OrderedDict

a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]
d = OrderedDict()

for item in a:
    # Get the tuple already stored for this first element, if any
    old = d.get(item[0])

    # Skip if the new second element is larger than the stored one
    if old is not None:
        if item[1] > old[1]:
            continue
        #else:
        #    del d[item[0]]

    # Assign (an existing key keeps its original position)
    d[item[0]] = item

print(list(d.values()))

Output:

[(5, 0), (3, 1), (6, 4)]

Or, if you enable the commented-out else branch, which deletes the key so that it is re-inserted at the end:

[(3, 1), (5, 0), (6, 4)]
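
On Python 3.7 and later, a plain dict also preserves insertion order, so the same idea should work without OrderedDict. The following is a rough sketch under that assumption; the function name dedupe_with_dict and the reposition flag are invented here to show both orderings side by side.

```python
def dedupe_with_dict(pairs, reposition=False):
    """Keep the smallest second element per first element.

    reposition=False: a key stays where it was first inserted.
    reposition=True:  a key moves to the position of its smallest tuple.
    (Hypothetical helper; relies on dicts keeping insertion order, Python 3.7+.)
    """
    best = {}
    for first, second in pairs:
        if first in best and second >= best[first]:
            continue  # the stored value is already at least as small
        if reposition:
            best.pop(first, None)  # remove so the key is re-inserted at the end
        best[first] = second
    return list(best.items())


a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]
print(dedupe_with_dict(a))                   # [(5, 0), (3, 1), (6, 4)]
print(dedupe_with_dict(a, reposition=True))  # [(3, 1), (5, 0), (6, 4)]
```

Which ordering is "right" depends on how the question is read: by where a first element initially appears, or by where its smallest tuple appears.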



Answer 3:


It seems to me that you need to know two things:

  1. The tuple that has the smallest second element for each first element.
  2. The order in which each first element should appear in the new list.

We can get #1 by using itertools.groupby and min.

import itertools
import operator

lst = [(3, 1), (5, 3), (5, 0), (3, 2), (6, 4)]
# I changed this slightly to make it harder to accidentally succeed.
# correct final order should be [(3, 1), (5, 0), (6, 4)]

tmplst = sorted(lst, key=operator.itemgetter(0))
groups = itertools.groupby(tmplst, operator.itemgetter(0))
# group by first element; written out, the groups look like:
# [(3, [(3, 1), (3, 2)]), (5, [(5, 3), (5, 0)]), (6, [(6, 4)])]
# note that groupby only merges consecutive items with equal keys,
# so we need to sort the list first

min_tuples = {min(v, key=operator.itemgetter(1)) for _, v in groups}
# pick the tuple with the smallest second element for each first element.
# In this case:
# {(3, 1), (5, 0), (6, 4)}
# (note that this is a set comprehension, for faster lookups later)

Now that we know what our result set looks like, we can walk through lst again to put its members in the right order.

seen = set()
result = []
for el in lst:
    if el not in min_tuples:  # don't add to result
        continue
    elif el not in seen:      # add to result and mark as seen
        result.append(el)
        seen.add(el)
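
The note about sorting before calling groupby can be checked with a small experiment. In this sketch, group_keys is a throwaway helper introduced only for the demonstration; it lists the keys that itertools.groupby produces for unsorted and for sorted input.

```python
import itertools
import operator

lst = [(3, 1), (5, 3), (5, 0), (3, 2), (6, 4)]
first = operator.itemgetter(0)


def group_keys(pairs):
    # Collect only the group keys, in the order groupby yields them.
    return [key for key, _ in itertools.groupby(pairs, first)]


unsorted_keys = group_keys(lst)
sorted_keys = group_keys(sorted(lst, key=first))

print(unsorted_keys)  # [3, 5, 3, 6]  (3 shows up as two separate groups)
print(sorted_keys)    # [3, 5, 6]
```

Because groupby starts a new group whenever the key changes, the unsorted list yields two groups for 3, and taking min per group would then keep both (3, 1) and (3, 2).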



Answer 4:


This works for the sample data, with one caveat: the result comes out ordered by second element rather than by position in the original list.

# I switched (5, 3) and (5, 0) to demonstrate sorting capabilities.
list_a = [(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]

# Create a list to contain the results
list_b = []

# Create a set of the first elements that have already been kept
seen = set()

# Iterate over the tuples sorted by second element, so the smallest
# second element for each first element is reached first
# (sorted() returns a new list and leaves list_a unchanged)
for i in sorted(list_a, key=lambda i: i[1]):

    # Check whether the 0th element of the tuple was already kept; if not:
    if i[0] not in seen:

        # Add the tuple the loop is currently on to the results; and
        list_b.append(i)

        # Record the 0th element of the tuple as seen
        seen.add(i[0])

>>> print(list_b)
[(5, 0), (3, 1), (6, 4)]

Hope this helped!
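
A sort-based approach like this one yields the kept tuples in order of their second elements. If the original positions matter, one possible variation is to carry each tuple's index through the sort and restore that order at the end. This is a sketch only; dedupe_restore_order is a hypothetical name, not something from the answer.

```python
def dedupe_restore_order(pairs):
    """Sort by second element to find the minima, then restore input order.

    (Hypothetical helper for illustration; assumes hashable first elements.)
    """
    # Pair every tuple with its original index, then sort by second element.
    indexed = sorted(enumerate(pairs), key=lambda entry: entry[1][1])

    seen = set()
    kept = []
    for index, pair in indexed:
        if pair[0] not in seen:
            seen.add(pair[0])
            kept.append((index, pair))

    # Put the survivors back in the order they had in the input.
    kept.sort(key=lambda entry: entry[0])
    return [pair for _, pair in kept]


print(dedupe_restore_order([(5, 3), (3, 1), (3, 2), (5, 0), (6, 4)]))
# [(3, 1), (5, 0), (6, 4)]
```

The two sorts make this O(n log n), so it is mainly of interest when sorting by second element is already part of the workflow.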




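Answer 5:


Using enumerate() and a list comprehension (note that both versions below keep the first occurrence of each first element, which matches the requested output only when that occurrence also has the smallest second element):

def remove_if_first_index(l):
    return [
        item
        for index, item in enumerate(l)
        if item[0] not in [value[0] for value in l[0:index]]
    ]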

Using enumerate() and a for loop:

def remove_if_first_index(l):

    # The list to store the return value
    ret = []

    # Get each index and item from the list passed in
    for index, item in enumerate(l):

        # Get the first number in each tuple up to the index we're currently at
        previous_values = [value[0] for value in l[0:index]]

        # If the item's first number is not in the list of previously encountered first numbers
        if item[0] not in previous_values:
            # Append it to the return list
            ret.append(item)

    return ret

Testing

some_list = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]
print(remove_if_first_index(some_list))
# [(5, 0), (3, 1), (6, 4)]
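
Rebuilding the list of previous first elements on every iteration makes the functions above quadratic in the length of the list. Assuming the first elements are hashable, a set of seen values gives the same "keep the first occurrence" behaviour in a single pass. The name keep_first_occurrence is made up for this sketch.

```python
def keep_first_occurrence(pairs):
    """Keep the first tuple seen for each first element, in one pass.

    (Hypothetical helper; like the functions above, it does not look at
    the second element when deciding which tuple to keep.)
    """
    seen = set()
    kept = []
    for item in pairs:
        if item[0] not in seen:
            seen.add(item[0])
            kept.append(item)
    return kept


print(keep_first_occurrence([(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]))
# [(5, 0), (3, 1), (6, 4)]
print(keep_first_occurrence([(5, 3), (5, 0)]))
# [(5, 3)]  (first occurrence, not the smallest second element)
```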



Answer 6:


I came up with this idea without having seen @Anton vBR's answer.

import collections

inp = [(5, 0), (3, 1), (3, 2), (5, 3), (6, 4)]

od = collections.OrderedDict()
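for i1, i2 in inp:
    # For a new key the default makes the test true; for an existing key
    # it is true only when i2 is no larger than the stored value
    if i2 <= od.get(i1, i2):
        od.pop(i1, None)  # drop any old entry so the key moves to the end
        od[i1] = i2
outp = list(od.items())
print(outp)  # [(5, 0), (3, 1), (6, 4)]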


Source: https://stackoverflow.com/questions/47246912/how-to-remove-duplicate-from-list-of-tuple-when-order-is-important
