In Python, how does one efficiently find the longest run of consecutive numbers in a list when they are not necessarily adjacent?


For instance, if I have a list

[1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]

This algorithm should return [1,2,3,4,5,6,7,8,9,10,11].

To cl

10 Answers
  • 2020-12-31 12:12

    OK, here's yet another attempt in Python:

    def popper(l):
        listHolders = []
        pos = 0
        while l:
            appended = False
            item = l.pop()
            for holder in listHolders:
                if item == holder[-1][0]-1:
                    appended = True
                    holder.append((item, pos))
            if not appended:
                pos += 1
                listHolders.append([(item, pos)])
        longest = []
        for holder in listHolders:
            try:
                if (holder[0][0] < longest[-1][0]) and (holder[0][1] > longest[-1][1]):
                    longest.extend(holder)
            except IndexError:  # longest is still empty
                pass
            if len(holder) > len(longest):
                longest = holder
        longest.reverse()
        return [x[0] for x in longest]
    

    Sample inputs and outputs:

    >>> from random import shuffle
    >>> demo = list(range(50))
    >>> shuffle(demo)
    >>> demo
    [40, 19, 24, 5, 48, 36, 23, 43, 14, 35, 18, 21, 11, 7, 34, 16, 38, 25, 46, 27, 26, 29, 41, 8, 31, 1, 33, 2, 13, 6, 44, 22, 17,
     12, 39, 9, 49, 3, 42, 37, 30, 10, 47, 20, 4, 0, 28, 32, 45, 15]
    >>> popper(demo)
    [1, 2, 3, 4]
    >>> demo = [1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]
    >>> popper(demo)
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
    >>>
    
  • 2020-12-31 12:14

    If the result really does have to be a sub-sequence of consecutive ascending integers, rather than merely ascending integers, then there's no need to remember each entire consecutive sub-sequence until you determine which is the longest, you need only remember the starting and ending values of each sub-sequence. So you could do something like this:

    import collections

    def longestConsecutiveSequence(sequence):
        # map each starting value to one past the largest ending value so far
        runs = collections.OrderedDict()

        for i in sequence:
            found = False
            for k, v in runs.items():
                if i == v:
                    # i extends the run that starts at k
                    runs[k] += 1
                    found = True

            if not found and i not in runs:
                runs[i] = i + 1

        # pick the widest (start, end) pair; end is exclusive, as range expects
        return range(*max(runs.items(), key=lambda i: i[1] - i[0]))
    

    If I run this on the original sample data (i.e. [1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11]) I get:

    >>> print(list(longestConsecutiveSequence([1,4,2,3,5,4,5,6,7,8,1,3,4,5,9,10,11])))
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
    

    If I run it on one of Abhijit's samples [1,2,3,11,12,13,14], I get:

    >>> print(list(longestConsecutiveSequence([1,2,3,11,12,13,14])))
    [11, 12, 13, 14]
    

    Regrettably, this algorithm is O(n*n) in the worst case.
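
    The quadratic cost comes from scanning every known run for each element. As a hedged alternative that is not part of the original answer, the sketch below keeps a dict from each value to the length of the longest consecutive run ending at that value, so each element costs one lookup. The name longest_consecutive_run is made up for this illustration, and ties go to the run that reaches the winning length first.

```python
def longest_consecutive_run(sequence):
    """Longest in-order subsequence of consecutive ascending integers."""
    # run_len[v] = length of the longest run seen so far that ends at v
    run_len = {}
    best_end, best_len = None, 0

    for value in sequence:
        # A run ending at value extends the best run ending at value - 1.
        length = run_len.get(value - 1, 0) + 1
        run_len[value] = length
        if length > best_len:
            best_end, best_len = value, length

    if best_end is None:  # empty input
        return []
    return list(range(best_end - best_len + 1, best_end + 1))
```

    Only the end value and the length are stored, in the same spirit as remembering just the start and end of each sub-sequence.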

  • 2020-12-31 12:16

    Not that clever, not O(n), could use a bit of optimization. But it works.

    def longest(seq):
      result = []
      for v in seq:
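        # Extend every run that currently ends at v - 1.
        for l in result:
          if v == l[-1] + 1:
            l.append(v)
        else:
          # for/else: the loop never breaks, so this always runs and
          # records v as the start of a possible new run.
          result.append([v])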
      return max(result, key=len)
    
  • 2020-12-31 12:16

    How about using a modified Radix Sort? As JanneKarila pointed out, this solution is not O(n): it relies on radix sort, whose complexity according to Wikipedia is O(k·n) for n keys which have k or fewer digits.

    This only works if you know the range of the numbers you're dealing with, so finding it is the first step.

    1. Look at each element in the starting list to find the lowest number, l, and the highest, h. In this case l is 1 and h is 11. Note: if you already know the range for some reason, you can skip this step.

    2. Create a result list the size of our range and set each element to null.

    3. Look at each element in the list and add it to the result list at the appropriate place. For example, if the element is a 4, add a 4 to the result list at position 4: result[element] = element. You can throw out duplicates if you want; they'll just be overwritten.

    4. Go through the result list to find the longest sequence without any null values. Keep an element_counter to know which element in the result list we're looking at. Keep a curr_start_element set to the beginning element of the current sequence, and a curr_len for how long the current sequence is. Also keep a longest_start_element and a longest_len, which start out as zero and are updated as we move through the list.

    5. Return the slice of the result list that starts at longest_start_element and contains longest_len elements.

    EDIT: Code added. Tested and working

    # Note: this doesn't work with negative numbers.
    # It's certainly possible to write this to work with negatives,
    # but the code is a bit hairier.
    def findLongestSequence(lst):
        if not lst:
            return []

        # step 1: find the highest value (values are assumed to be >= 0)
        high = max(lst)

        # step 2: one slot per possible value, initially empty
        result = [None] * (high + 1)

        # step 3: drop each number into the slot matching its value
        for num in lst:
            result[num] = num

        # step 4: scan for the longest run of filled slots
        curr_start_element = -1
        curr_len = 0
        longest_start_element = 0
        longest_len = 0

        for element_counter in range(len(result)):
            if result[element_counter] is None:
                # a gap ends the current run
                curr_start_element = -1
                curr_len = 0
                continue

            if curr_start_element == -1:
                curr_start_element = element_counter
            curr_len += 1

            # checked on every filled slot, so a run that ends
            # at the last element is still counted
            if curr_len > longest_len:
                longest_start_element = curr_start_element
                longest_len = curr_len

        # step 5
        return result[longest_start_element:longest_start_element + longest_len]
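
    The comments above say negatives are possible but hairier with an index-based result list. One way around that, offered here as a sketch rather than as the original author's method, is to swap the list for a set, so no value has to serve as an index. The function name find_longest_sequence_any_sign is hypothetical. Like the code above, it ignores the order of the input and only looks at which values are present. When two runs have the same length, which one is returned is unspecified.

```python
def find_longest_sequence_any_sign(lst):
    """Longest run of consecutive integers present in lst, any sign."""
    values = set(lst)  # duplicates collapse, as in step 3
    best_start, best_len = 0, 0

    for start in values:
        # Only begin counting at the first value of a run.
        if start - 1 in values:
            continue
        end = start
        while end + 1 in values:
            end += 1
        length = end - start + 1
        if length > best_len:
            best_start, best_len = start, length

    return list(range(best_start, best_start + best_len))
```

    Each run is walked once, from its first value, so the work stays proportional to the number of distinct values and does not depend on the size of the range.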
    