Can anyone suggest a good solution to remove duplicates from a nested list, where duplicates are determined by the first element of each inner list?
The main list:
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
If the order does not matter, the code below
print([[k] + v for k, v in dict((a[0], a[1:]) for a in reversed(L)).items()])
gives
[['14', '65', 76], ['7', '12', 33], ['2', '5', 6]]
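To see why this keeps the first occurrence of each key: dict() keeps the last value it sees for a duplicate key, so iterating over reversed(L) makes the earliest entry in L win. A minimal sketch of the intermediate dict:

d = dict((a[0], a[1:]) for a in reversed(L))
# '14' is first inserted as ['22', 46], then overwritten by ['65', 76],
# so the first occurrence from L is the one that survives
print(d)  # {'14': ['65', 76], '7': ['12', 33], '2': ['5', 6]}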
Use pandas:
import pandas as pd
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46],['7','a','b']]
df = pd.DataFrame(L)
df = df.drop_duplicates(subset=[0])  # deduplicate on the first column, i.e. the first element of each inner list
L_no_duplicates = df.values.tolist()
If you want to drop duplicates based on other specific columns instead, pass those column labels:
df = df.drop_duplicates(subset=[1, 2])
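pandas's drop_duplicates keeps the first occurrence by default; if you would rather keep the last row for each duplicated key, it also takes a keep argument:

df = df.drop_duplicates(subset=[0], keep='last')  # keep the last occurrence of each first-column value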
I am not sure what you meant by "another list", so I assume you mean the lists inside L:
a = []  # first elements seen so far
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
for item in L:
    if item[0] not in a:
        a.append(item[0])
        print(item)
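If you need the deduplicated list itself rather than printed output, the same loop can collect the surviving items; a minimal variant of the above:

seen = []
result = []
for item in L:
    if item[0] not in seen:
        seen.append(item[0])
        result.append(item)
print(result)  # [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]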
Use a dict instead, like so:
L = {'14': ['65', 76], '2': ['5', 6], '7': ['12', 33]}
L['14'] = ['22', 46]
If you are receiving the first list from some external source, convert it like so:
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
L_dict = dict((x[0], x[1:]) for x in L)
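To get back to the nested-list shape after deduplicating through the dict, rebuild it from the items; a small sketch (note that plain dict(...) keeps the last occurrence of each key, so iterate over reversed(L) if you want the first):

L_dict = dict((x[0], x[1:]) for x in L)
L_unique = [[k] + v for k, v in L_dict.items()]
print(L_unique)  # [['14', '22', 46], ['2', '5', 6], ['7', '12', 33]]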
Do you care about preserving order / which duplicate is removed? If not, then:
list(dict((x[0], x) for x in L).values())
will do it. If you want to preserve order and keep the first one you find, then:
def unique_items(L):
    found = set()
    for item in L:
        if item[0] not in found:
            found.add(item[0])
            yield item

print(list(unique_items(L)))
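If you later need to deduplicate on something other than the first element, the same pattern generalizes to an arbitrary key function; unique_by below is a hypothetical helper, not part of the answer above:

def unique_by(items, key):
    found = set()  # keys seen so far
    for item in items:
        k = key(item)
        if k not in found:
            found.add(k)
            yield item

print(list(unique_by(L, key=lambda x: x[0])))  # same result as unique_items(L)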