Easy interview question got harder: given numbers 1..100, find the missing number(s) given exactly k are missing

时光说笑 2020-11-22 07:02

I had an interesting job interview experience a while back. The question started really easy:

Q1: We have a bag containing numbers

30 answers
  •  清歌不尽
    2020-11-22 08:00

    There is a general way to extend streaming algorithms like this one. The idea is to use a bit of randomization to 'spread' the k missing elements into independent subproblems, each of which our original single-missing-number algorithm can solve. This technique is used in sparse signal reconstruction, among other things.

    • Make an array, a, of size u = k^2, initialized to zero.
    • Pick any universal hash function, h : {1,...,n} -> {1,...,u} (multiply-shift, for example).
    • For each i in 1, ..., n, add i to its bucket: a[h(i)] += i.
    • For each number x in the input stream, subtract it from its bucket: a[h(x)] -= x.

    If all of the missing numbers have been hashed to different buckets, the non-zero entries of the array are now exactly the missing numbers.
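
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the answerer's own code: the names `make_hash`, `try_recover`, and `find_missing` are hypothetical, a Carter-Wegman style hash ((a*x + b) mod p) mod u is swapped in for multiply-shift, buckets are indexed 0..u-1, and the retry wrapper assumes the input can be read more than once (a true one-pass stream would instead run several hash functions side by side).

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; this sketch assumes n < P


def make_hash(u, rng):
    """Draw a random hash x -> ((a*x + b) mod P) mod u (hypothetical helper)."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % u


def try_recover(stream, n, k, rng):
    """One attempt: return the k missing numbers, or None on a bucket collision."""
    u = max(1, k * k)          # array size u = k^2 (at least one bucket)
    h = make_hash(u, rng)
    buckets = [0] * u
    for i in range(1, n + 1):  # add every number that should be present
        buckets[h(i)] += i
    for x in stream:           # subtract every number that actually is present
        buckets[h(x)] -= x
    found = [v for v in buckets if v != 0]
    # Each bucket now holds the sum of the missing numbers hashed to it.
    # Those sums are positive, so exactly k non-zero buckets means no two
    # missing numbers shared a bucket.
    if len(found) != k:
        return None
    return sorted(found)


def find_missing(numbers, n, k, seed=0, max_tries=64):
    """Retry with fresh hash functions until an attempt has no collision."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        result = try_recover(numbers, n, k, rng)
        if result is not None:
            return result
    raise RuntimeError("no collision-free hash found; increase max_tries or u")
```

Calling `find_missing(present, 100, 3)` on a list holding 1..100 with three values removed should return those three values in sorted order. Each attempt fails with probability below 1/2, so a handful of retries is almost always enough.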

    The probability that a particular pair is sent to the same bucket is at most 1/u, by definition of a universal hash function. Since there are fewer than k^2/2 pairs of missing numbers, a union bound puts the error probability below (k^2/2)/u = 1/2. That is, we succeed with probability at least 50%, and increasing u improves our chances.
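
Written out, the collision bound is a union bound over pairs of missing numbers. The notation x_1, ..., x_k for the missing numbers is introduced here for illustration only; u = k^2 as in the construction above.

```latex
\Pr[\text{some collision}]
  \;\le\; \sum_{1 \le i < j \le k} \Pr\bigl[h(x_i) = h(x_j)\bigr]
  \;\le\; \binom{k}{2} \cdot \frac{1}{u}
  \;=\; \frac{k(k-1)}{2k^2}
  \;<\; \frac{1}{2}.
```

With a general u in place of k^2, the same steps give a failure probability below k^2/(2u), which is why enlarging the array helps.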

    Notice that this algorithm takes on the order of k^2 log n bits of space (we need on the order of log n bits per array bucket). This matches the space required by @Dimitris Andreou's answer (in particular the space requirement of polynomial factorization, which happens to also be randomized). This algorithm also has constant time per update, rather than time k in the case of power sums.

    In fact, we can be even more efficient than the power sum method by using the trick described in the comments.
