Efficient Algorithm for String Concatenation with Overlap

前端 未结 12 2245
轻奢々
轻奢々 2020-12-02 21:32

We need to combine 3 columns in a database by concatenation. However, the 3 columns may contain overlapping parts and the parts should not be duplicated. For example,

<
12条回答
  •  时光说笑
    2020-12-02 21:54

    Here's a solution in Python. It should be faster just by not needing to build substrings in memory all the time. The work is done in the _concat function, which concatenates two strings. The concat function is a helper that concatenates any number of strings.

    def concat(*args):
        result = ''
        for arg in args:
            result = _concat(result, arg)
        return result
    
    def _concat(a, b):
        la = len(a)
        lb = len(b)
        for i in range(la):
            j = i
            k = 0
            while j < la and k < lb and a[j] == b[k]:
                j += 1
                k += 1
            if j == la:
                n = k
                break
        else:
            n = 0
        return a + b[n:]
    
    if __name__ == '__main__':
        assert concat('a', 'b', 'c') == 'abc'
        assert concat('abcde', 'defgh', 'ghlmn') == 'abcdefghlmn'
        assert concat('abcdede', 'dedefgh', '') == 'abcdedefgh'
        assert concat('abcde', 'd', 'ghlmn') == 'abcdedghlmn'
        assert concat('abcdef', '', 'defghl') == 'abcdefghl'
    

提交回复
热议问题