How to compare a list of lists/sets in Python?

情深已故 2020-12-01 03:47

What is the easiest way to compare two lists/sets and output the differences? Are there any built-in functions that will help me compare nested lists/sets?

Inputs:

8 Answers
  • 2020-12-01 04:10

    Using set comprehensions, you can make it a one-liner.

    To get a set of tuples:

    Differences = {tuple(i) for i in First_list} ^ {tuple(i) for i in Secnd_list}
    

    Or, to get a list of tuples:

    Differences = list({tuple(i) for i in First_list} ^ {tuple(i) for i in Secnd_list})
    

    Or, to get a list of lists (if you really want one):

    Differences = [list(j) for j in {tuple(i) for i in First_list} ^ {tuple(i) for i in Secnd_list}]
    

    PS: I read at https://stackoverflow.com/a/10973817/4900095 that using map() is not considered the Pythonic way to do this.

  • 2020-12-01 04:13

    I guess you'll have to convert your lists to sets (with the inner lists as tuples, since set elements must be hashable):

    >>> a = {('a', 'b'), ('c', 'd'), ('e', 'f')}
    >>> b = {('a', 'b'), ('h', 'g')}
    >>> a.symmetric_difference(b)
    {('e', 'f'), ('h', 'g'), ('c', 'd')}
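
If the inner lists themselves contain lists or sets, a single `tuple()` call is not enough, because every element has to be hashable all the way down. A minimal sketch, assuming a hypothetical helper named `freeze` (not a built-in) and made-up sample data:

```python
def freeze(obj):
    """Hypothetical helper: recursively turn lists into tuples and sets into frozensets."""
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    return obj  # assumed to be hashable already (str, int, ...)


# Made-up sample data with one extra level of nesting
a = [['a', ['b', 'c']], ['d', {'e'}]]
b = [['a', ['b', 'c']], ['x', ['y']]]

# Rows that appear in exactly one of the two inputs
diff = {freeze(row) for row in a} ^ {freeze(row) for row in b}
print(diff)
```

The result is a set, so the order of the printed rows is not guaranteed.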
    
  • 2020-12-01 04:19

    The difflib module (http://docs.python.org/library/difflib.html) is a good starting place for what you are looking for.

    If you apply it recursively to the deltas, you should be able to handle nested data structures. But it will take some work.
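
As a rough sketch of the non-recursive case, `difflib.SequenceMatcher` can compare two lists of rows once each row is converted to a tuple (it needs hashable elements). The function name `diff_rows` and the `'+'`/`'-'` markers are my own choices for illustration, and the sample data is the pair of lists used elsewhere in this thread:

```python
import difflib

First_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333']]
Secnd_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333'], ['Test4.doc', '4d4d4d', '4444']]


def diff_rows(a, b):
    """Return ('-', row) for rows only in a and ('+', row) for rows only in b, in sequence order."""
    a_rows = [tuple(row) for row in a]  # SequenceMatcher needs hashable elements
    b_rows = [tuple(row) for row in b]
    matcher = difflib.SequenceMatcher(None, a_rows, b_rows, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ('delete', 'replace'):
            changes.extend(('-', list(row)) for row in a_rows[i1:i2])
        if tag in ('insert', 'replace'):
            changes.extend(('+', list(row)) for row in b_rows[j1:j2])
    return changes


print(diff_rows(First_list, Secnd_list))
```

Unlike the set-based answers, this compares the rows as sequences, so a row that merely moved to a different position can show up as a delete plus an insert.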

  • 2020-12-01 04:19

    Note that with this method you will lose the order:

    first_set = set(map(tuple, S))
    second_set = set(map(tuple, T))
    # Symmetric difference: rows that appear in exactly one of S and T
    print(list(map(list, first_set ^ second_set)))
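
If the order matters, one possible variation is to filter the original lists against the sets instead of combining the sets. This is only a sketch: the contents of `S` and `T` below are made-up sample data, and `ordered_diff` is a name chosen for illustration.

```python
# Made-up sample data
S = [['a', 'b'], ['c', 'd'], ['e', 'f']]
T = [['h', 'g'], ['a', 'b']]

first_set = set(map(tuple, S))
second_set = set(map(tuple, T))

# Walk the original lists so the result follows the input order:
# rows found only in S come first, then rows found only in T.
ordered_diff = [row for row in S if tuple(row) not in second_set]
ordered_diff += [row for row in T if tuple(row) not in first_set]
print(ordered_diff)
```

The sets are used only for fast membership tests, so a row that is repeated within one list is kept as many times as it occurs.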
    
  • 2020-12-01 04:20
    >>> First_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333']] 
    >>> Secnd_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333'], ['Test4.doc', '4d4d4d', '4444']] 
    
    
    >>> z = [tuple(y) for y in First_list]
    >>> z
    [('Test.doc', '1a1a1a', '1111'), ('Test2.doc', '2b2b2b', '2222'), ('Test3.doc', '3c3c3c', '3333')]
    >>> x = [tuple(y) for y in Secnd_list]
    >>> x
    [('Test.doc', '1a1a1a', '1111'), ('Test2.doc', '2b2b2b', '2222'), ('Test3.doc', '3c3c3c', '3333'), ('Test4.doc', '4d4d4d', '4444')]
    
    
    >>> set(x) - set(z)
    {('Test4.doc', '4d4d4d', '4444')}
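
Keep in mind that `-` is directional: it only reports rows of the left operand that are missing from the right one. A short sketch that reports both directions separately, using the same two lists (the names `only_in_first` and `only_in_second` are chosen here for illustration):

```python
First_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333']]
Secnd_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333'], ['Test4.doc', '4d4d4d', '4444']]

first = {tuple(row) for row in First_list}
second = {tuple(row) for row in Secnd_list}

only_in_first = first - second    # rows missing from Secnd_list
only_in_second = second - first   # rows missing from First_list

print(only_in_first)
print(only_in_second)
```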
    
  • 2020-12-01 04:22

    Old question, but here's a solution I use for returning the unique elements that are not present in both lists.

    I use this for comparing the values returned from a database and the values generated by a directory crawler package. I didn't like the other solutions I found because many of them could not dynamically handle both flat lists and nested lists.

    def differentiate(x, y):
        """
        Retrieve a list of unique elements that do not exist in both x and y
        (the symmetric difference of the two inputs).
        Capable of parsing one-dimensional (flat) and two-dimensional (lists of lists) lists.

        :param x: list #1
        :param y: list #2
        :return: list of unique values (order is not preserved)
        """
        # Use the first available element to decide how to treat the input;
        # this also covers the case where one or both lists are empty
        sample = x[0] if x else (y[0] if y else None)

        # Dealing with a 2D dataset (list of lists or tuples).
        # Checked explicitly because tuple() also accepts strings, so a flat
        # list of strings would not raise TypeError and would be split into characters.
        if isinstance(sample, (list, tuple)):
            # Get the input type to convert back to before return
            input_type = type(sample)
            # Immutable and unique - convert rows to tuples and collect them in sets
            first_set = set(map(tuple, x))
            secnd_set = set(map(tuple, y))
            # Values found in exactly one input, converted back to the original type
            return [input_type(i) for i in first_set ^ secnd_set]

        # Dealing with a 1D dataset (list of items): unique values only
        return list(set(x) ^ set(y))
    