Pythonic way of removing reversed duplicates in list


I have a list of pairs:

[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]

and I want to remove any duplicates where

[a,b] == [


                      


      
      
        
9 answers

        
                         				            
            
           
            
                              
                
              
              
                
南旧, 2020-12-06 10:03
              
            
            
                                                                       
An easy and unnested solution:

pairs = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
s = set()
for p in pairs:
    # Lists are unhashable, so convert each pair to a tuple
    p = tuple(p)
    # Keep the pair only if neither it nor its reverse has been seen
    if p not in s and p[::-1] not in s:
        s.add(p)

print(s)

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
一生所求, 2020-12-06 10:06
              
            
            
                                                                       
Well, I am "checking for the reverse pair and append to a list if that's not the case" as you said you could do, but I'm using a single loop.

x = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
out = []
for pair in x:
    # pair[::-1] is the reversed pair; skip this pair if its reverse is already kept
    if pair[::-1] not in out:
        out.append(pair)
print(out)


The advantage over the existing answers is, IMO, readability. No deep knowledge of the standard library is needed, and there is nothing complex to keep track of. The only concept that might be unfamiliar to beginners is that [::-1] reverses the pair.

The running time is O(n**2), though, because "not in out" scans the whole list on every iteration, so avoid this if performance matters or the lists are big.
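If the quadratic cost matters, the same single loop can keep a set of already-seen tuples next to the output list. The following is only a sketch and not part of the original answer: the function name drop_reversed is made up for illustration, it assumes the items in each pair are hashable, and unlike the loop above it also skips exact repeats.

```python
def drop_reversed(pairs):
    """Keep the first occurrence of each pair, skipping reversed or exact repeats."""
    seen = set()  # tuples already kept, for O(1) average membership tests
    out = []      # result, in first-seen order and orientation
    for pair in pairs:
        key = tuple(pair)  # lists are unhashable, tuples are not
        if key in seen or key[::-1] in seen:
            continue
        seen.add(key)
        out.append(pair)
    return out


x = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
print(drop_reversed(x))  # [[0, 1], [0, 4], [1, 4]]
```

The list still carries the result, so order and orientation are kept; the set only replaces the linear scan.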
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
花落未央, 2020-12-06 10:16
              
            
            
                                                                       
You could sort each pair, convert your list of pairs to a set of tuples, and then convert back again:

l = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
[list(tpl) for tpl in set(tuple(sorted(pair)) for pair in l)]
#=> [[0, 1], [1, 4], [0, 4]]  (a set is unordered, so the order may differ)


The individual steps might be easier to understand than a long one-liner:

>>> l = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
>>> [sorted(pair) for pair in l]
# [[0, 1], [0, 4], [0, 1], [1, 4], [0, 4], [1, 4]]
>>> [tuple(pair) for pair in _]
# [(0, 1), (0, 4), (0, 1), (1, 4), (0, 4), (1, 4)]
>>> set(_)
# {(0, 1), (1, 4), (0, 4)}
>>> list(_)
# [(0, 1), (1, 4), (0, 4)]
>>> [list(tpl) for tpl in _]
# [[0, 1], [1, 4], [0, 4]]
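The steps above can be folded into a small helper. This is a sketch under two assumptions that are not in the original answer: the name unique_unordered is hypothetical, and the items of each pair must be mutually comparable so that sorted() works. The final sorted() call is added only to make the output order deterministic.

```python
def unique_unordered(pairs):
    """Deduplicate pairs regardless of item order; returns lists in sorted order."""
    canonical = {tuple(sorted(pair)) for pair in pairs}  # one tuple per unordered pair
    return [list(tpl) for tpl in sorted(canonical)]      # sort for a stable order


l = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
print(unique_unordered(l))  # [[0, 1], [0, 4], [1, 4]]
```

Note that each returned pair is in sorted item order, so a pair that only ever appeared as [4, 1] comes back as [1, 4].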

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
广开言路, 2020-12-06 10:17
              
            
            
                                                                       
TL;DR

set(map(frozenset, lst))


Explanation

If the pairs are logically unordered, they're more naturally expressed as sets. It would be better to have them as sets before you even get to this point, but you can convert them like this:

lst = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
lst_as_sets = map(frozenset, lst)


And then the natural way of eliminating duplicates in an iterable is to convert it to a set:

deduped = set(lst_as_sets)


(This is the main reason I chose frozenset in the first step. Mutable sets are not hashable, so they can't be added to a set.)

Or you can do it in a single line like in the TL;DR section.

I think this is much simpler, more intuitive, and more closely matches how you think about the data than fussing with sorting and tuples.

Converting back

If for some reason you really need a list of lists as the final result, converting back is trivial:

result_list = list(map(list, deduped))


But it's probably more logical to leave it all as sets as long as possible. I can only think of one reason that you might need this, and that's compatibility with existing code/libraries.
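One caveat worth checking before adopting the frozenset approach: a pair whose two items are equal, such as [0, 0], collapses to a one-element set, so converting back does not restore it. The sketch below illustrates this with made-up sample data and shows a sorted-tuple key as one possible alternative; neither the data nor the alternative comes from the answer above.

```python
lst = [[0, 1], [1, 0], [0, 0]]

deduped = set(map(frozenset, lst))
# frozenset([0, 0]) collapses to frozenset({0}), so the round trip loses an item
lengths = sorted(len(s) for s in deduped)
print(lengths)  # [1, 2]

# Possible alternative: a sorted tuple keeps repeated items intact
as_tuples = {tuple(sorted(pair)) for pair in lst}
print(sorted(as_tuples))  # [(0, 0), (0, 1)]
```

If the data can never contain a pair with two equal items, the frozenset version is unaffected.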
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
闹比i, 2020-12-06 10:22
              
            
            
                                                                       
EDITED to better explain

First sort each list, then use dictionary keys to get a unique set of elements, and finally rebuild the lists with a list comprehension.


Why tuples?
Replacing the lists with tuples is necessary to avoid the "unhashable type" error when passing them to the fromkeys() function.

my_list = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
tuple_list = [tuple(sorted(item)) for item in my_list]
final_list = [list(item) for item in dict.fromkeys(tuple_list)]


Using OrderedDict preserves the list order even on Python versions before 3.7, where a plain dict does not guarantee insertion order.


from collections import OrderedDict

my_list = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
tuple_list = [tuple(sorted(item)) for item in my_list]
final_list = [list(item) for item in OrderedDict.fromkeys(tuple_list)]


The above code will result in the desired list

[[0, 1], [0, 4], [1, 4]]
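On Python 3.7 and later, where a plain dict is guaranteed to keep insertion order, the two steps can be combined without OrderedDict. A minimal sketch, assuming that interpreter version; the variable name unique_keys is introduced here for illustration:

```python
my_list = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]

# dict keys are unique and, from Python 3.7 on, keep insertion order
unique_keys = dict.fromkeys(tuple(sorted(item)) for item in my_list)
final_list = [list(key) for key in unique_keys]

print(final_list)  # [[0, 1], [0, 4], [1, 4]]
```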

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
南笙, 2020-12-06 10:22
              
            
            
                                                                       
If the order of pairs and pair-items matters, creating a new list by testing for membership might be the way to go here. 

pairs = [[0, 1], [0, 4], [1, 0], [1, 4], [4, 0], [4, 1]]
no_dups = []
for pair in pairs:
    # Skip the pair if every item in it already appears in some kept pair
    if not any(all(i in p for i in pair) for p in no_dups):
        no_dups.append(pair)


Otherwise, I'd go with Styvane's answer. 

Incidentally, the above solution will not work for pairs whose two items are equal. For example, [0,0] would not be added to the list once a kept pair already contains 0. For that, you'd need to add an additional check:

for pair in pairs:
    if not any(all(i in p for i in pair) for p in no_dups) or (len(set(pair)) == 1 and not pair in no_dups):
        no_dups.append(pair)


However, that solution will not pick up empty "pairs" (e.g., []). For that, you'll need one more adjustment:

for pair in pairs:
    if not any(all(i in p for i in pair) for p in no_dups) or (len(set(pair)) in (0, 1) and not pair in no_dups):
        no_dups.append(pair)


The and not pair in no_dups bit is required to prevent adding the [0,0] or [] to no_dups twice. 
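An alternative that avoids the special cases is to compare pairs by a canonical key instead of by item membership. The sketch below is an assumption-laden variant, not the answer's method: the name dedupe_keep_order is hypothetical, the sample data is made up, and items must be hashable and mutually comparable. It keeps the first-seen order of pairs and the original item order within each pair.

```python
def dedupe_keep_order(pairs):
    """Drop later pairs that match an earlier one in either item order."""
    seen = set()
    no_dups = []
    for pair in pairs:
        key = tuple(sorted(pair))  # same key for [a, b] and [b, a]
        if key not in seen:
            seen.add(key)
            no_dups.append(pair)   # original item order is kept
    return no_dups


pairs = [[1, 0], [0, 1], [0, 0], [0, 0], [], [], [4, 1]]
print(dedupe_keep_order(pairs))  # [[1, 0], [0, 0], [], [4, 1]]
```

Because [0, 0] maps to the key (0, 0) and [] maps to the empty tuple, both are kept exactly once without extra conditions.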
                                                                        
                                                        
            
            
              
                