In python, how does one efficiently find the largest consecutive set of numbers in a list that are not necessarily adjacent?

前端未结

关注

 10  748

For instance, if I have a list

[1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]

This algorithm should return [1,2,3,4,5,6,7,8,9,10,11].

To cl

相关标签:

10条回答

滥情空心

2020-12-31 12:12

Ok, here's yet another attempt in python:

def popper(l):
    listHolders = []
    pos = 0
    while l:
        appended = False
        item = l.pop()
        for holder in listHolders:
            if item == holder[-1][0]-1:
                appended = True
                holder.append((item, pos))
        if not appended:
            pos += 1
            listHolders.append([(item, pos)])
    longest = []
    for holder in listHolders:
        try:
            if (holder[0][0] < longest[-1][0]) and (holder[0][1] > longest[-1][1]):
                longest.extend(holder)
        except:
            pass
        if len(holder) > len(longest):
            longest = holder
    longest.reverse()
    return [x[0] for x in longest]

Sample inputs and outputs:

>>> demo = list(range(50))
>>> shuffle(demo)
>>> demo
[40, 19, 24, 5, 48, 36, 23, 43, 14, 35, 18, 21, 11, 7, 34, 16, 38, 25, 46, 27, 26, 29, 41, 8, 31, 1, 33, 2, 13, 6, 44, 22, 17,
 12, 39, 9, 49, 3, 42, 37, 30, 10, 47, 20, 4, 0, 28, 32, 45, 15]
>>> popper(demo)
[1, 2, 3, 4]
>>> demo = [1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]
>>> popper(demo)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>>

0 讨论(0)

面向向阳花

2020-12-31 12:14
If the result really does have to be a sub-sequence of consecutive ascending integers, rather than merely ascending integers, then there's no need to remember each entire consecutive sub-sequence until you determine which is the longest, you need only remember the starting and ending values of each sub-sequence. So you could do something like this:
```
def longestConsecutiveSequence(sequence):
    # map starting values to largest ending value so far
    map = collections.OrderedDict()

    for i in sequence:
        found = False
        for k, v in map.iteritems():
            if i == v:
                map[k] += 1
                found = True

        if not found and i not in map:
            map[i] = i + 1

    return xrange(*max(map.iteritems(), key=lambda i: i[1] - i[0]))
```
If I run this on the original sample date (i.e. [1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]) I get:
```
>>> print list(longestConsecutiveSequence([1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```
If I run it on one of Abhijit's samples [1,2,3,11,12,13,14], I get:
```
>>> print list(longestConsecutiveSequence([1,2,3,11,12,13,14]))
[11, 12, 13, 14]
```
Regrettably, this algorithm is O(n*n) in the worst case.
0 讨论(0)
发布评论:

提交评论
- 加载中...

失恋的感觉

2020-12-31 12:16

Not that clever, not O(n), could use a bit of optimization. But it works.

def longest(seq):
  result = []
  for v in seq:
    for l in result:
      if v == l[-1] + 1:
        l.append(v)
    else:
      result.append([v])
  return max(result, key=len)

0 讨论(0)

名媛妹妹

2020-12-31 12:16
How about using a modified Radix Sort? As JanneKarila pointed out the solution is not O(n). It uses Radix sort, which wikipedia says Radix sort's efficiency is O(k·n) for n keys which have k or fewer digits.

This will only work if you know the range of numbers that we're dealing with so that will be the first step.
1. Look at each element in starting list to find lowest, l and highest, h number. In this case l is 1 and h is 11. Note, if you already know the range for some reason, you can skip this step.
2. Create a result list the size of our range and set each element to null.
3. Look at each element in list and add them to the result list at the appropriate place if needed. ie, the element is a 4, add a 4 to the result list at position 4. result[element] = starting_list[element]. You can throw out duplicates if you want, they'll just be overwritten.
4. Go through the result list to find the longest sequence without any null values. Keep a element_counter to know what element in the result list we're looking at. Keep a curr_start_element set to the beginning element of the current sequence and keep a curr_len of how long the current sequence is. Also keep a longest_start_element and a `longest_len' which will start out as zero and be updated as we move through the list.
5. Return the result list starting at longest_start_element and taking longest_len
EDIT: Code added. Tested and working
```
#note this doesn't work with negative numbers
#it's certainly possible to write this to work with negatives
# but the code is a bit hairier
import sys
def findLongestSequence(lst):
    #step 1
    high = -sys.maxint - 1

    for num in lst:
        if num > high:
            high = num

    #step 2
    result = [None]*(high+1)

    #step 3
    for num in lst:
        result[num] = num

    #step 4
    curr_start_element = 0
    curr_len = 0
    longest_start_element = -1
    longest_len = -1

    for element_counter in range(len(result)):
        if result[element_counter] == None:

            if curr_len > longest_len:
                longest_start_element = curr_start_element
                longest_len = curr_len

            curr_len = 0
            curr_start_element = -1

        elif curr_start_element == -1:
            curr_start_element = element_counter

        curr_len += 1

    #just in case the last element makes the longest
    if curr_len > longest_len:
        longest_start_element = curr_start_element
        longest_len = curr_len


    #step 5
    return result[longest_start_element:longest_start_element + longest_len-1]
```
0 讨论(0)
发布评论:

提交评论
- 加载中...

上一页 1 2