Interleave different length lists, eliminating duplicates, and preserve order

青春惊慌失措 2021-02-11 22:00

I have two lists, let's say:

keys1 = ['A', 'B', 'C', 'D', 'E',           'H', 'I']
keys2 = ['A', 'B',           'E', 'F', 'G', 'H',      'J', 'K']


        
6 Answers
  •  深忆病人
    2021-02-11 22:40

    What you need is basically what any merge utility does: it merges two sequences while keeping the relative order within each one. You can use Python's difflib module to diff the two sequences and then merge them:

    from difflib import SequenceMatcher
    
    def merge_sequences(seq1, seq2):
        sm = SequenceMatcher(a=seq1, b=seq2)
        res = []
        for op, start1, end1, start2, end2 in sm.get_opcodes():
            if op in ('equal', 'delete'):
                # This range appears in both sequences, or only in the first one.
                res += seq1[start1:end1]
            elif op == 'insert':
                # This range appears only in the second sequence.
                res += seq2[start2:end2]
            elif op == 'replace':
                # The sequences have different ranges here - add both.
                res += seq1[start1:end1]
                res += seq2[start2:end2]
        return res
    

    Example:

    >>> keys1 = ['A', 'B', 'C', 'D', 'E',           'H', 'I']
    >>> keys2 = ['A', 'B',           'E', 'F', 'G', 'H',      'J', 'K']
    >>> merge_sequences(keys1, keys2)
    ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
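
    To see where that result comes from, it can help to print the opcodes that SequenceMatcher produces for these two lists. This is only an illustrative sketch; the `opcodes` variable and the print format are my own choices, not part of the function above:

```python
from difflib import SequenceMatcher

keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
keys2 = ['A', 'B', 'E', 'F', 'G', 'H', 'J', 'K']

# Each opcode is (tag, i1, i2, j1, j2): how keys1[i1:i2] relates to keys2[j1:j2].
opcodes = SequenceMatcher(a=keys1, b=keys2).get_opcodes()
for op, i1, i2, j1, j2 in opcodes:
    print(f"{op:8} keys1[{i1}:{i2}]={keys1[i1:i2]}  keys2[{j1}:{j2}]={keys2[j1:j2]}")
```

    The tags come out as equal, delete, equal, insert, equal, replace. The final replace is the one place where both ranges are appended, which is why 'I' lands before 'J', 'K'.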
    

    Note that the answer you expect is not necessarily the only valid one. For example, swapping the two arguments gives a different result that is just as valid:

    >>> merge_sequences(keys2, keys1)
    ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'I']
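
    One caveat, which is my own observation and not something the example above exercises: if the same element sits at different relative positions in the two lists, a diff-based merge can emit it twice. A minimal sketch of an order-preserving dedup pass follows, assuming the elements are hashable; `merge_unique` is a hypothetical helper name:

```python
from difflib import SequenceMatcher

def merge_sequences(seq1, seq2):
    # Same diff-based merge as above, repeated so this block runs on its own.
    res = []
    for op, s1, e1, s2, e2 in SequenceMatcher(a=seq1, b=seq2).get_opcodes():
        if op in ('equal', 'delete'):
            res += seq1[s1:e1]
        elif op == 'insert':
            res += seq2[s2:e2]
        elif op == 'replace':
            res += seq1[s1:e1]
            res += seq2[s2:e2]
    return res

def merge_unique(seq1, seq2):
    # dict keys keep insertion order (Python 3.7+), so only the first
    # occurrence of each element survives, in merged order.
    return list(dict.fromkeys(merge_sequences(seq1, seq2)))

print(merge_sequences(['A', 'B'], ['B', 'A']))  # ['B', 'A', 'B']
print(merge_unique(['A', 'B'], ['B', 'A']))     # ['B', 'A']
```

    When the two lists disagree on order like this, no merge can respect both, so keeping the first occurrence is just one reasonable tie-break.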
    
