Find shortest subarray containing all elements

后端未结

关注

 6  663

Suppose you have an array of numbers, and another set of numbers. You have to find the shortest subarray containing all numbers with minimal complexity.

The array ca

相关标签:

6条回答

梦毁少年i

2020-12-13 05:21

Say the array has n elements, and set has m elements

Sort the array, noting the reverse index (position in the original array)
// O (n log n) time

for each element in given set
   find it in the array
// O (m log n) time  - log n for binary serch, m times

keep track of the minimum and maximum index for each found element

min - max defines your range

Total time complexity: O ((m+n) log n)

0 讨论(0)

别跟我提以往

2020-12-13 05:23

This sounds like a problem that is well-suited for a sliding window approach: maintain a window (a subarray) that is gradually expanding and contracting, and use a hashmap to keep track of the number of times each "interesting" number occurs in the window. E.g. start with an empty window, then expand it to include only element 0, then elements 0-1, then 0-2, 0-3, and so on, by adding subsequent elements (and using the hashmap to keep track of which numbers exist in the window). When the hashmap tells you that all interesting numbers exist in the window, you can begin contracting it: e.g. 0-5, 1-5, 2-5, etc., until you find out that the window no longer contains all interesting numbers. Then, you can begin expanding it on the right hand side again, and so on. I'm quite (but not entirely) sure that this would work for your problem, and it can be implemented to run in linear time.

0 讨论(0)
发布评论:

提交评论
- 加载中...

野性不改

2020-12-13 05:23

This solution definitely does not run in O(n) time as suggested by some of the pseudocode above, however it is real (Python) code that solves the problem and by my estimates runs in O(n^2):

def small_sub(A, B):
    len_A = len(A)
    len_B = len(B)

    sub_A = []
    sub_size = -1
    dict_b = {}

    for elem in B:
        if elem in dict_b:
            dict_b[elem] += 1
        else:
            dict_b.update({elem: 1})

    for i in range(0, len_A - len_B + 1):
        if A[i] in dict_b:
            temp_size, temp_sub = find_sub(A[i:], dict_b.copy())

            if (sub_size == -1 or (temp_size != -1 and temp_size < sub_size)):
                sub_A = temp_sub
                sub_size = temp_size

    return sub_size, sub_A

def find_sub(A, dict_b):
    index = 0
    for i in A:
        if len(dict_b) == 0:
            break

        if i in dict_b:
            dict_b[i] -= 1

            if dict_b[i] <= 0:
                del(dict_b[i])

        index += 1

    if len(dict_b) > 0:
        return -1, {}
    else:
        return index, A[0:index]

0 讨论(0)

情书的邮戳

2020-12-13 05:29

Here's how I solved this problem in linear time using collections.Counter objects

from collections import Counter

def smallest_subsequence(stream, search):
    if not search:
        return []  # the shortest subsequence containing nothing is nothing

    stream_counts = Counter(stream)
    search_counts = Counter(search)

    minimal_subsequence = None

    start = 0
    end = 0
    subsequence_counts = Counter()

    while True:
        # while subsequence_counts doesn't have enough elements to cancel out every
        # element in search_counts, take the next element from search
        while search_counts - subsequence_counts:
            if end == len(stream):  # if we've reached the end of the list, we're done
                return minimal_subsequence
            subsequence_counts[stream[end]] += 1
            end += 1

        # while subsequence_counts has enough elements to cover search_counts, keep
        # removing from the start of the sequence
        while not search_counts - subsequence_counts:
            if minimal_subsequence is None or (end - start) < len(minimal_subsequence):
                minimal_subsequence = stream[start:end]
            subsequence_counts[stream[start]] -= 1
            start += 1

print(smallest_subsequence([1, 2, 5, 8, 7, 6, 2, 6, 5, 3, 8, 5], [5, 7]))
# [5, 8, 7]

0 讨论(0)

情歌与酒

2020-12-13 05:34

Java solution

    List<String> paragraph = Arrays.asList("a", "c", "d", "m", "b", "a");
    Set<String> keywords = Arrays.asList("a","b");

    Subarray result = new Subarray(-1,-1);

    Map<String, Integer> keyWordFreq = new HashMap<>();

    int numKeywords = keywords.size();

    // slide the window to contain the all the keywords**
    // starting with [0,0]
    for (int left = 0, right = 0 ; right < paragraph.size() ; right++){

      // expand right to contain all the keywords
      String currRight = paragraph.get(right);

      if (keywords.contains(currRight)){
        keyWordFreq.put(currRight, keyWordFreq.get(currRight) == null ? 1 : keyWordFreq.get(currRight) + 1);
      }

      // loop enters when all the keywords are present in the current window
      // contract left until the all the keywords are still present
      while (keyWordFreq.size() == numKeywords){
        String currLeft = paragraph.get(left);

        if (keywords.contains(currLeft)){

          // remove from the map if its the last available so that loop exists
          if (keyWordFreq.get(currLeft).equals(1)){

            // now check if current sub array is the smallest
            if((result.start == -1 && result.end == -1) || (right - left) < (result.end - result.start)){
              result = new Subarray(left, right);
            }
            keyWordFreq.remove(currLeft);
          }else {

            // else reduce the frequcency
            keyWordFreq.put(currLeft, keyWordFreq.get(currLeft) - 1);
          }
        }
        left++;
      }

    }

    return result;
  }

0 讨论(0)

难免孤独

2020-12-13 05:39
Proof of a linear-time solution

I will write right-extension to mean increasing the right endpoint of a range by 1, and left-contraction to mean increasing the left endpoint of a range by 1. This answer is a slight variation of Aasmund Eldhuset's answer. The difference here is that once we find the smallest j such that [0, j] contains all interesting numbers, we thereafter consider only ranges that contain all interesting numbers. (It's possible to interpret Aasmund's answer this way, but it's also possible to interpret it as allowing a single interesting number to be lost due to a left-contraction -- an algorithm whose correctness has yet to be established.)

The basic idea is that for each position j, we will find the shortest satisfying range ending at position j, given that we know the shortest satisfying range ending at position j-1.

EDIT: Fixed a glitch in the base case.

Base case: Find the smallest j' such that [0, j'] contains all interesting numbers. By construction, there can be no ranges [0, k < j'] that contain all interesting numbers so we don't need to worry about them further. Now find the ~~smallest~~largest i such that [i, j'] contains all interesting numbers (i.e. hold j' fixed). This is the smallest satisfying range ending at position j'.

To find the smallest satisfying range ending at any arbitrary position j, we can right-extend the smallest satisfying range ending at position j-1 by 1 position. This range will necessarily also contain all interesting numbers, though it may not be minimal-length. The fact that we already know this is a satisfying range means that we don't have to worry about extending the range "backwards" to the left, since that can only increase the range over its minimal length (i.e. make the solution worse). The only operations we need to consider are left-contractions that preserve the property of containing all interesting numbers. So the left endpoint of the range should be advanced as far as possible while this property holds. When no more left-contractions can be performed, we have the minimal-length satisfying range ending at j (since further left-contractions clearly cannot make the range satisfying again) and we are done.

Since we perform this for each rightmost position j, we can take the minimum-length range over all rightmost positions to find the overall minimum. This can be done using a nested loop in which j advances on each outer loop cycle. Clearly j advances by 1 n times. Since at any point in time we only ever need the leftmost position of the best range for the previous value of j, we can store this in i and just update it as we go. i starts at 0, is at all times <= j <= n, and only ever advances upwards by 1, meaning it can advance at most n times. Both i and j advance at most n times, meaning that the algorithm is linear-time.

In the following pseudo-code, I've combined both phases into a single loop. We only try to contract the left side if we have reached the stage of having all interesting numbers:
```
# x[0..m-1] is the array of interesting numbers.
# Load them into a hash/dictionary:
For i from 0 to m-1:
    isInteresting[x[i]] = 1

i = 0
nDistinctInteresting = 0
minRange = infinity
For j from 0 to n-1:
    If count[a[j]] == 0 and isInteresting[a[j]]:
        nDistinctInteresting++
    count[a[j]]++

    If nDistinctInteresting == m:
        # We are in phase 2: contract the left side as far as possible
        While count[a[i]] > 1 or not isInteresting[a[i]]:
            count[a[i]]--
            i++

        If j - i < minRange:
            (minI, minJ) = (i, j)
```
count[] and isInteresting[] are hashes/dictionaries (or plain arrays if the numbers involved are small).
0 讨论(0)
发布评论:

提交评论
- 加载中...

Find shortest subarray containing all elements

Proof of a linear-time solution