How to uniquify a list of dicts in Python

长发绾君心 2020-12-10 01:10

I have a list:

d = [{'x':1, 'y':2}, {'x':3, 'y':4}, {'x':1, 'y':2}]

{'x':1, 'y':2} appears more than once. I want to remove the duplicates.

6 Answers
  • 2020-12-10 01:32

    Avoid the whole problem by using namedtuples instead:

    from collections import namedtuple
    
    Point = namedtuple('Point', ['x', 'y'])
    better_d = [Point(1, 2), Point(3, 4), Point(1, 2)]
    print(set(better_d))  # namedtuples are hashable, so a set drops the duplicate
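
If the data already exists as a list of dicts, the conversion might look like the sketch below. It assumes every dict has exactly the keys x and y; the names points, unique_points and unique_dicts are made up for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}]

# Assumption: every dict has exactly the keys 'x' and 'y';
# a missing or extra key makes Point(**item) raise a TypeError.
points = [Point(**item) for item in d]

# namedtuples are hashable as long as their fields are,
# so a set removes the duplicates.
unique_points = set(points)

# Convert back to plain dicts if needed; a set has no defined order.
unique_dicts = [p._asdict() for p in unique_points]
```

Note that the result comes back in arbitrary order, since it passes through a set.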
    
  • 2020-12-10 01:33

    Dicts aren't hashable, so you can't put them in a set. A relatively efficient approach is to turn each dict's (key, value) pairs into a frozenset and hash those (feel free to eliminate the intermediate variables):

    def unique_dicts(dicts):
        # A plain set is not hashable; a frozenset is, and it ignores key order.
        frozen = [frozenset(d.items()) for d in dicts]
        unique = set(frozen)
        return [dict(pairs) for pairs in unique]
    

    If the values aren't always hashable, this is not possible at all using sets, and you'll probably have to fall back to the O(n^2) approach of an `in` check per element.
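
A minimal sketch of how the two paths could be combined: dicts whose values are all hashable go through a set, and everything else falls back to the per-element `in` check. The function name dedupe_dicts is hypothetical, and unlike the set-only version it keeps the first occurrence of each dict in its original position.

```python
def dedupe_dicts(dicts):
    """Return dicts without duplicates, keeping first occurrences in order."""
    seen = set()        # fast path: frozensets of (key, value) pairs
    unhashable = []     # slow path: dicts that contain unhashable values
    result = []
    for d in dicts:
        try:
            # Raises TypeError if any value (e.g. a list) is unhashable.
            key = frozenset(d.items())
        except TypeError:
            # Linear scan, so O(n^2) in the worst case.
            if d not in unhashable:
                unhashable.append(d)
                result.append(d)
        else:
            if key not in seen:
                seen.add(key)
                result.append(d)
    return result


print(dedupe_dicts([{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}]))
# [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
```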

  • 2020-12-10 01:37

    Another bit of dark magic (please don't beat me):

    list(map(dict, set(map(lambda x: tuple(x.items()), d))))  # list() is needed on Python 3, where map is lazy
    
  • 2020-12-10 01:45

    If your values are hashable, this will work:

    >>> [dict(y) for y in set(tuple(x.items()) for x in d)]
    [{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
    

    EDIT:

    I tried it on lists with no duplicates and it seemed to work fine:

    >>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
    >>> [dict(y) for y in set(tuple(x.items()) for x in d)]
    [{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
    

    and

    >>> d = [{'x':1,'y':2}]
    >>> [dict(y) for y in set(tuple(x.items()) for x in d)]
    [{'y': 2, 'x': 1}]
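
One caveat that may be worth illustrating: tuple(x.items()) follows each dict's insertion order, so on Python 3.7+ two equal dicts built with their keys in a different order produce different tuples, and both survive. The sketch below shows this, along with one possible workaround (sorting the items, which assumes the keys can be compared with each other). The names naive and by_sorted are made up for illustration.

```python
# Two equal dicts whose keys were inserted in a different order.
d = [{'x': 1, 'y': 2}, {'y': 2, 'x': 1}]

# The tuples differ, (('x', 1), ('y', 2)) vs (('y', 2), ('x', 1)),
# so the set keeps both.
naive = [dict(t) for t in set(tuple(x.items()) for x in d)]
print(len(naive))      # 2

# Sorting the items first gives equal dicts the same tuple.
by_sorted = [dict(t) for t in set(tuple(sorted(x.items())) for x in d)]
print(len(by_sorted))  # 1
```

Using frozenset(x.items()) instead of a sorted tuple avoids the ordering question without requiring comparable keys.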
    
  • 2020-12-10 01:45

    A simple loop:

    tmp = []
    
    for i in d:
        if i not in tmp:  # list membership compares dicts with ==
            tmp.append(i)

    print(tmp)
    # [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
    
  • 2020-12-10 01:54

    Converting each dict to a tuple won't work if any value in the dict is unhashable, such as a list.

    For example, given:

    data = [
      {'a': 1, 'b': 2},
      {'a': 1, 'b': 2},
      {'a': 2, 'b': 3}
    ]
    

    using [dict(y) for y in set(tuple(x.items()) for x in data)] returns the unique dicts.

    However, the same expression raises a TypeError on data like this, because the list values are unhashable:

    data = [
      {'a': 1, 'b': 2, 'c': [1,2]},
      {'a': 1, 'b': 2, 'c': [1,2]},
      {'a': 2, 'b': 3, 'c': [3]}
    ]
    

    If performance is not a concern, json.dumps/json.loads can be a good choice (sort_keys=True makes the result independent of key order):

    import json

    unique = {json.dumps(d, sort_keys=True) for d in data}
    data = [json.loads(s) for s in unique]
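
A possible variant, sketched under the assumption that every value is JSON-serializable: use the JSON string only as a lookup key and keep the original dicts, so the input order is preserved and nothing is changed by the round trip (for example, tuples coming back as lists). The function name unique_by_json is hypothetical.

```python
import json


def unique_by_json(dicts):
    """Drop duplicate dicts, keeping the first occurrence of each in order."""
    seen = set()
    result = []
    for d in dicts:
        # sort_keys=True gives equal dicts the same string
        # regardless of the order their keys were inserted in.
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


data = [
    {'a': 1, 'b': 2, 'c': [1, 2]},
    {'b': 2, 'a': 1, 'c': [1, 2]},
    {'a': 2, 'b': 3, 'c': [3]},
]
print(unique_by_json(data))
# [{'a': 1, 'b': 2, 'c': [1, 2]}, {'a': 2, 'b': 3, 'c': [3]}]
```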
    