How many times update maximum gets called when trying to find the running maximum in a randomized array of numbers?

為{幸葍}努か 提交于 2019-12-22 18:11:48

问题


Suppose we have an array with integers -N to N in an array size of 2N + 1. We first shuffle the elements in the array, then try to find the maximum integer by iterating through the array from the first element to the last element: (code example is in Java)

int called = 0;
int max = Integer.MIN_VALUE;
for (int i : array) {
    if (i > max) {
        called++;
        max = i;
    }
}

What is the expectation(average over many runs) of value called after iterating through the array?

Edit:

How I found it to be close to ln(array.length):

public static void main(String args[]) {
    List<Integer> list = new ArrayList<>();
    for (int i = 0; i < 1000000; i++) list.add(i);

    int called = 0;
    int runs = 100;

    for (int i = 1; i <= runs; i++) {
        Collections.shuffle(list);
        int max = -1;
        for (int num : list) {
            if (num > max) {
                called++;
                max = num;
            }
        }
    }
    System.out.println("Expectation: " + called/runs);
}

回答1:


Let's consider randomly shuffled array. We want to estimate K - number of times than i-th element is bigger than all of its predecessors.

Expected value of K equals to sum of probabilities that i-th element is bigger than all its predecessors. E(K) = Σ12N+1 Pi.

Now we want to find Pi. Consider prefix of length i for our array. Probability that the last element in the prefix is the biggest is 1/i. So we have Pi = 1/i.

Consequently:E(K) = Σ12N+11/i. We can estimate this sum via definite integral as ln(2N+1)+O(1).

So, the answer is ln(2N+1)+O(1).

And Monte-Carlo simulation for kinda proof:

constexpr size_t N = 1000;
std::array<int, 2 * N + 1> arr;
std::iota(arr.begin(), arr.end(), -N);
std::random_device rd;
std::mt19937 g(rd());
double moving = 0;
for (double trial = 1; trial < 10001; ++trial) {
    std::shuffle(arr.begin(), arr.end(), g);
    int called = 0;
    int max = std::numeric_limits<int>::min();
    for (int i = 1; i < arr.size(); ++i) {
        if (arr[i] > max) {
            ++called;
            max = arr[i];
        }
    }
    if (trial > 1) {
        moving = moving * ((trial - 1) / trial) + called / trial;
    }
    else {
        moving = called;
    }
}
cout << "actual:   " << moving << endl;
cout << "expected: " << std::log(2 * N + 1) << " + O(1)" << endl;
actual:   8.1581   
expected: 7.6014 + O(1)



回答2:


Originally I came across this problem during a software engineer interview. Here is what I found that could be used to solve this problem:

Cycle Notation

Stirling numbers of the first kind

As Yola and rici have pointed out, the answer should be ln(n) + γ (where γ is the Euler-Mascheroni constant and n is the number of elements in the array).



来源:https://stackoverflow.com/questions/49267640/how-many-times-update-maximum-gets-called-when-trying-to-find-the-running-maximu

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!