I have seen some similar answers, but I can't find something specific for this case:
I have a list of dictionaries like this:
[
{\"element\":Bla, \
Apologies for the terrible variable names. There is probably a cleaner way, but this should work:
seen = {(item["element"], item["version"]): False for item in mylist}
output = []
for item in mylist:
    item_key = (item["element"], item["version"])
    if not seen[item_key]:
        output.append(item)
        seen[item_key] = True
Pandas can solve this quickly:
import pandas as pd
Bla = "Bla"
d = [
{"element":Bla, "version":2, "date":"12/04/12"},
{"element":Bla, "version":2, "date":"12/05/12"},
{"element":Bla, "version":3, "date":"12/04/12"}
]
df = pd.DataFrame(d)
df[~df.drop("date", axis=1).duplicated()]
output:
date element version
0 12/04/12 Bla 2
2 12/04/12 Bla 3
You say you have a lot of other keys in the dictionary not mentioned in the question.
Here is an O(n) algorithm to do what you need:
>>> from pprint import pprint
>>> seen = set()
>>> result = []
>>> for d in dicts:
...     h = d.copy()
...     h.pop('date')
...     h = tuple(sorted(h.items()))  # sorted so that key order cannot affect the comparison
...     if h not in seen:
...         result.append(d)
...         seen.add(h)
...
>>> pprint(result)
[{'date': '12/04/12', 'element': 'Bla', 'version': 2},
{'date': '12/04/12', 'element': 'Bla', 'version': 3}]
h is a copy of the dict, with the date key removed by pop.
It is then converted to a tuple of its items, which is hashable and can therefore be added to a set.
If h has not been seen before, d is appended to result and h is added to seen. Additions to seen and lookups (h not in seen) are both O(1) on average.
At the end, result contains only the elements whose h values are unique.
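The same idea can be wrapped in a reusable function. This is only a minimal sketch, not the code from the answer above: the name `dedupe_ignoring` and its `ignore` parameter are made up for illustration, and it builds a `frozenset` of the remaining items rather than a tuple so that the key does not depend on key order. It assumes every remaining value is hashable.

```python
def dedupe_ignoring(dicts, ignore=("date",)):
    """Keep the first dict seen for each combination of non-ignored items.

    Hypothetical helper: the name and the `ignore` parameter are illustrative.
    """
    seen = set()
    result = []
    for d in dicts:
        # Hashable, order-independent key built from everything except the
        # ignored keys. The values themselves must be hashable.
        key = frozenset((k, v) for k, v in d.items() if k not in ignore)
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


dicts = [
    {"element": "Bla", "version": 2, "date": "12/04/12"},
    {"element": "Bla", "version": 2, "date": "12/05/12"},
    {"element": "Bla", "version": 3, "date": "12/04/12"},
]
unique = dedupe_ignoring(dicts)
```

With the sample data from the question, `unique` should hold the first and third dictionaries, since the second differs from the first only in its date.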
You could use the "unique_everseen" recipe from the itertools documentation (also packaged in more_itertools) to create a new list.
list(unique_everseen(original_list, key=lambda e: '{element}@{version}'.format(**e)))
If your "key" needs to be wider than the lambda I have written (to accommodate more values), then it's probably worth extracting to a function:
def key_without_date(element):
    return '@'.join("{}".format(v) for k, v in sorted(element.items()) if k != 'date')
list(unique_everseen(original_list, key=key_without_date))
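Since `unique_everseen` is a recipe rather than a function exported by the `itertools` module, it has to be defined or imported before use. Below is a simplified sketch of it, written from memory of the documented recipe rather than copied from it, together with a key function that returns a tuple instead of a formatted string. The names `element_and_version` and the sample `original_list` contents are assumptions for illustration.

```python
def unique_everseen(iterable, key=None):
    """Yield elements in order, skipping any whose key was already seen.

    Simplified version of the recipe; keys must be hashable.
    """
    seen = set()
    for element in iterable:
        k = element if key is None else key(element)
        if k not in seen:
            seen.add(k)
            yield element


def element_and_version(d):
    # Hypothetical key function: a tuple avoids accidental collisions that a
    # joined string could produce (for example when a value contains "@").
    return (d["element"], d["version"])


original_list = [
    {"element": "Bla", "version": 2, "date": "12/04/12"},
    {"element": "Bla", "version": 2, "date": "12/05/12"},
    {"element": "Bla", "version": 3, "date": "12/04/12"},
]
deduped = list(unique_everseen(original_list, key=element_and_version))
```

Because the generator yields the first element seen for each key, the earliest date is the one that survives for each element/version pair.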
This works:
LoD=[
{"element":'Bla', "version":2, 'list':[1,2,3], "date":"12/04/12"},
{"element":'Bla', "version":2, 'list':[1,2,3], "date":"12/05/12"},
{"element":'Bla', "version":3, 'list':[1,2,3], "date":"12/04/12"}
]
LoDcopy=[]
seen=set()
for d in LoD:
    dc = d.copy()
    del dc['date']
    s = str(dc)  # the string form is hashable even when values are lists
    if s in seen:
        continue
    seen.add(s)
    LoDcopy.append(d)
print(LoDcopy)
prints:
[{'date': '12/04/12', 'version': 2, 'list': [1, 2, 3], 'element': 'Bla'},
{'date': '12/04/12', 'version': 3, 'list': [1, 2, 3], 'element': 'Bla'}]
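One caveat with using the string form of a dict as the key is that two equal dicts can produce different strings if their keys are in a different order. A possible variation, sketched here under the assumption that all values are JSON-serializable, is to use `json.dumps` with `sort_keys=True` as the canonical string. The function name `dedupe_by_json` and its `ignore` parameter are hypothetical.

```python
import json


def dedupe_by_json(dicts, ignore=("date",)):
    """Deduplicate dicts whose values may be unhashable (e.g. lists).

    Hypothetical helper; assumes every value can be serialized to JSON.
    """
    seen = set()
    result = []
    for d in dicts:
        trimmed = {k: v for k, v in d.items() if k not in ignore}
        # sort_keys=True makes the string independent of key order.
        key = json.dumps(trimmed, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


LoD = [
    {"element": "Bla", "version": 2, "list": [1, 2, 3], "date": "12/04/12"},
    {"list": [1, 2, 3], "version": 2, "element": "Bla", "date": "12/05/12"},
    {"element": "Bla", "version": 3, "list": [1, 2, 3], "date": "12/04/12"},
]
LoDcopy = dedupe_by_json(LoD)
```

In this sample the second dict has its keys in a different order from the first, yet it is still recognized as a duplicate once the date is ignored.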