Longest equally-spaced subsequence


遥遥无期 2020-12-22 19:12

I have a million integers in sorted order and I would like to find the longest subsequence where the difference between consecutive pairs is equal. For example

10 Answers

死守一世寂寞 (original poster)

2020-12-22 19:39

This is my 2 cents.

If you have a list called input:

input = [1, 4, 5, 7, 8, 12]

You can build a data structure that, for each of these points (excluding the first one), tells you how far that point is from each of its predecessors:

[1, 4, 5, 7, 8, 12]
 x  3  4  6  7  11   # distance from point i to point 0
 x  x  1  3  4   8   # distance from point i to point 1
 x  x  x  2  3   7   # distance from point i to point 2
 x  x  x  x  1   5   # distance from point i to point 3
 x  x  x  x  x   4   # distance from point i to point 4
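As a minimal sketch of the table above, each column can be built as follows. The helper name distance_columns is hypothetical, and the columns are keyed by index rather than by value, which is an assumption made here for illustration:

```python
def distance_columns(values):
    """Map each index i >= 1 to the distances from values[i] to every predecessor."""
    return {
        i: [values[i] - values[k] for k in range(i)]
        for i in range(1, len(values))
    }

values = [1, 4, 5, 7, 8, 12]
columns = distance_columns(values)
for i, col in columns.items():
    # One line per column of the table, read from top to bottom.
    print(values[i], col)
```

For 7, for instance, this prints the column [6, 3, 2], which matches the fourth column of the table.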

Now that you have the columns, you can consider the i-th item of input (which is input[i]) and each number n in its column.

The numbers that belong to an equally spaced series containing input[i] are those that have n * j in the i-th position of their column, where j counts the matches found so far while moving through the columns from left to right (starting at 1 for the pair formed by input[i] and its predecessor). To these we add the predecessor input[k], where k is the index of n in the column of input[i].

Example: if we consider i = 1, input[i] = 4, n = 3, then we can identify a sequence comprising 4 (input[i]), 7 (because it has a 3 in position 1 of its column) and 1, because k is 0, so we take input[0], the first predecessor.
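The rule and the example can be checked with a short sketch. The function name extend_pair and its parameters are assumptions made for illustration. It computes the differences directly instead of reading them from precomputed columns, which gives the same values:

```python
def extend_pair(values, k, i):
    """Grow the equally spaced series that starts with values[k], values[i]."""
    n = values[i] - values[k]  # the distance found in the column of values[i]
    seq = [values[k], values[i]]
    j = 1  # next multiple of n to look for
    for successor in values[i + 1:]:
        # successor - values[i] is the entry at position i of the successor's column
        if successor - values[i] == n * j:
            seq.append(successor)
            j += 1
    return seq

values = [1, 4, 5, 7, 8, 12]
print(extend_pair(values, 0, 1))  # [1, 4, 7]
print(extend_pair(values, 1, 4))  # [4, 8, 12]
```

Because the list is sorted, a single left-to-right pass is enough to pick up every later member of the series.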

Possible implementation (sorry if the code is not using the same notation as the explanation):

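def build_columns(l):
    # Map each value (except the first) to its distances from all predecessors.
    columns = {}
    for i, x in enumerate(l[1:], start=1):
        # Use the position i rather than l.index(x), which is slow and
        # returns the wrong position when l contains duplicates.
        columns[x] = [x - y for y in l[:i]]
    return columns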

def algo(input, columns):
    # The parameter name shadows the built-in input(); it is kept to match the explanation.
    seqs = []
    for index1, number in enumerate(input[1:], start=1):
        for index2, distance in enumerate(columns[number]):
            # Start from the pair (input[index2], number), which is `distance` apart.
            seq = [input[index2], number]
            matches = 1
            for successor in input[index1 + 1:]:
                column = columns[successor]
                # column[index1] is successor - number.
                if column[index1] == distance * matches:
                    matches += 1
                    seq.append(successor)
            if len(seq) > 2:
                seqs.append(seq)
    return seqs

The longest one:

sequences = algo(input, build_columns(input))
print(max(sequences, key=len))
