Algorithm to determine if array contains n…n+m?

清酒与你 2020-11-28 01:45

I saw this question on Reddit. No positive solutions were presented there, so I thought it would be a perfect question to ask here. This was in a thread about interview

30 answers
  •  盖世英雄少女心
    2020-11-28 02:09

    A while back I heard about a very clever sorting algorithm from someone who worked for the phone company. They had to sort a massive number of phone numbers. After going through a bunch of different sort strategies, they finally hit on a very elegant solution: they created a bit array and treated the offset into the bit array as the phone number. They then swept through their database in a single pass, setting the bit for each number to 1. After that, they swept through the bit array once, spitting out the phone numbers for entries that had the bit set.
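To make the phone-number trick concrete, here is a minimal Python sketch of that kind of bit-array sort. The name `bitmap_sort` and the `max_value` parameter are made up for illustration, and the sketch assumes the inputs are non-negative integers no larger than `max_value`. Duplicates collapse into a single entry, which is exactly the property that makes the idea useful for duplicate detection.

```python
def bitmap_sort(numbers, max_value):
    """Sort non-negative integers <= max_value using one bit per possible value.

    Hypothetical helper for illustration; duplicates in the input are collapsed.
    """
    bits = bytearray(max_value // 8 + 1)  # one bit per possible value, all zero

    # Pass 1: set the bit whose offset equals the number.
    for x in numbers:
        bits[x >> 3] |= 1 << (x & 7)

    # Pass 2: sweep the bit array in order and emit every offset whose bit is set.
    return [v for v in range(max_value + 1) if bits[v >> 3] & (1 << (v & 7))]
```

For example, `bitmap_sort([5, 1, 9, 0, 3], 9)` walks the input once and the bit array once, with no comparisons between elements.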

    Along those lines, I believe that you can use the data in the array itself as a meta data structure to look for duplicates. Worst case, you could have a separate array, but I'm pretty sure you can use the input array if you don't mind a bit of swapping.

    I'm going to leave out the n parameter for the time being, because that just confuses things. Adding in an index offset is pretty easy to do.

    Consider:

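    sum = 0
    for i = 0 to m-1
      while (a[i] != i)                          // a[i] is not in its home slot yet
        if (a[i] < 0 or a[i] >= m) return false  // value outside 0..m-1
        if (a[a[i]] == a[i]) return false        // we have a duplicate
        swapArrayIndexes(a[i], i)                // move the value a[i] into slot a[i]
      sum = sum + a[i]
    next
    
    // sum of n..n+m-1; n = 0 in this version
    if sum = (2*n+m-1)*m/2 return true else return false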
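A runnable Python sketch of the swap-into-place idea above, with the offset n put back in. It assumes the goal is to decide whether an array of m integers is a permutation of n, n+1, ..., n+m-1; the function name `is_consecutive_run` is invented for this example. The sum check is left out here, since an array of m in-range values with no duplicates must already be the full run.

```python
def is_consecutive_run(a, n):
    """Return True if list a (length m) is a permutation of n, n+1, ..., n+m-1.

    Illustrative sketch: rearranges a in place and uses O(1) extra space.
    """
    m = len(a)
    for i in range(m):
        # Keep swapping until slot i holds its "home" value n + i.
        while a[i] != n + i:
            j = a[i] - n              # home index of the value currently in slot i
            if j < 0 or j >= m:
                return False          # value outside n..n+m-1
            if a[j] == a[i]:
                return False          # home slot already holds this value: duplicate
            a[i], a[j] = a[j], a[i]   # move the value into its home slot
    return True
```

Each swap parks one value in its final slot, so the inner loop cannot run more than m times in total across the whole scan.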
    

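    This can look closer to O(n log n) than O(n) because of the nested loop, but each swap moves one value into its home slot, so there are at most m swaps in total and the whole pass is O(m) for an array of m elements. It also runs in constant space and may provide a different vector of attack for the problem.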

    Alternatively, an array of bytes and some bit operations will provide the duplication check in O(n) time, at the cost of one extra bit per element: about m/8 bytes of memory (m/32 words, assuming 32-bit ints, of course).
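The bit-array variant might look like the following Python sketch. As before, it assumes the target range is n..n+m-1 for an array of m integers, and `is_consecutive_run_bitset` is a hypothetical name. Unlike the swapping version it leaves the input untouched, paying for that with one bit of scratch memory per element.

```python
def is_consecutive_run_bitset(a, n):
    """Return True if a (length m) is a permutation of n..n+m-1, without modifying a.

    Illustrative sketch: uses one "seen" bit per expected value.
    """
    m = len(a)
    seen = bytearray((m + 7) // 8)    # m bits, rounded up to whole bytes
    for x in a:
        j = x - n                     # offset of x within the expected range
        if j < 0 or j >= m:
            return False              # value outside n..n+m-1
        mask = 1 << (j & 7)
        if seen[j >> 3] & mask:
            return False              # bit already set: duplicate
        seen[j >> 3] |= mask
    # m in-range values with no duplicates cover the whole range.
    return True
```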

    EDIT: The above algorithm could be improved further by adding the sum check to the inside of the loop, and check for:

    if sum > (2*n+m-1)*m/2 return false
    

    that way it will fail fast.
