Algorithm: efficient way to remove duplicate integers from an array

离开以前 2020-11-22 16:03

I got this problem from an interview with Microsoft.

Given an array of random integers, write an algorithm in C that removes duplicated numbers an

30 answers
  •  余生分开走
    2020-11-22 16:15

    1. Using O(1) extra space, in O(n log n) time

    This is possible, for instance:

    • first do an in-place O(n log n) sort
    • then walk through the list once, writing the first instance of every value back to the beginning of the list

    I believe ejel's partner is correct that the best way to do this would be an in-place merge sort with a simplified merge step, and that this is probably the intent of the question. That would make sense if, e.g., you were writing a new library function to do this as efficiently as possible with no ability to improve the inputs, and for some kinds of input it would be useful to do so without a hash table. But I haven't actually checked this.
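
    A minimal sketch of approach 1, under some assumptions. It uses heapsort rather than the merge sort mentioned above, because heapsort is a simple in-place sort with O(n log n) worst-case time and O(1) extra space. The names sift_down, heap_sort and dedup_sorted are illustrative only. Note that sorting discards the original order of the elements.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the max-heap property for the subtree rooted at `root`,
   looking only at the first `n` elements of `a`. */
static void sift_down(int *a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n)
            return;
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                    /* pick the larger child */
        if (a[root] >= a[child])
            return;
        int tmp = a[root];
        a[root] = a[child];
        a[child] = tmp;
        root = child;
    }
}

/* In-place heapsort: O(n log n) time, O(1) extra space. */
static void heap_sort(int *a, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )   /* build the heap */
        sift_down(a, i, n);
    for (size_t end = n; end-- > 1; ) { /* move the max to the end */
        int tmp = a[0];
        a[0] = a[end];
        a[end] = tmp;
        sift_down(a, 0, end);
    }
}

/* Sort `a`, then write the first instance of every value back to the
   front. Returns the number of distinct values, stored in a[0..result-1]. */
size_t dedup_sorted(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n == 0)
        return 0;
    heap_sort(a, n);
    size_t out = 1;
    for (size_t i = 1; i < n; i++)
        if (a[i] != a[out - 1])
            a[out++] = a[i];
    return out;
}
```

    The caller treats the return value as the new length; whatever is left in the array beyond that index should be ignored.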

    2. Using O(lots) extra space, in O(n) time

    • declare a zero'd array big enough to hold all integers
    • walk through the array once
    • for each integer, look at the corresponding array element
    • if it is already 1, skip that integer as a duplicate; otherwise set it to 1 and keep the integer

    This only works if several questionable assumptions hold:

    • it's possible to zero memory cheaply, or the size of the ints is small compared to the number of them
    • you're happy to ask your OS for 256^sizeof(int) flags, one per possible int value
    • and it will cache that memory for you really, really efficiently if it's gigantic

    It's a bad answer in general, but if you have LOTS of input elements and they're all 8-bit integers (or maybe even 16-bit integers), it could be the best way.
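
    For the 8-bit case, approach 2 might look like the sketch below. It assumes the input is an array of uint8_t, so the flag table needs only 256 entries. The name dedup_u8 is hypothetical. Unlike the sorting approach, this keeps the first occurrence of each value in its original order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Remove duplicates from an array of 8-bit values in one pass.
   Returns the new length; the kept values are in a[0..result-1]. */
size_t dedup_u8(uint8_t *a, size_t n)
{
    assert(a != NULL || n == 0);
    bool seen[256] = { false };     /* one flag per possible value */
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (!seen[a[i]]) {          /* first time we meet this value */
            seen[a[i]] = true;
            a[out++] = a[i];
        }
    }
    return out;
}
```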

    3. O(little)-ish extra space, O(n)-ish time

    As in #2, but use a hash table to record which integers have already been seen.
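
    Standard C has no built-in hash table, so the sketch below rolls a small open-addressing set with linear probing. Everything in it is an assumption made for illustration: the name dedup_hashed, the multiplicative hash constant, the table size of at least 2n slots, and the convention of returning (size_t)-1 on allocation failure. The extra space is O(n) and the running time is O(n) on average, not in the worst case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Remove duplicates, keeping first occurrences in their original order.
   Returns the new length, or (size_t)-1 if memory allocation fails. */
size_t dedup_hashed(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n == 0)
        return 0;

    /* Power-of-two capacity of at least 2n keeps the load factor at or
       below 1/2, so a probe always finds a free slot.
       (Not hardened against absurdly large n.) */
    size_t cap = 16;
    while (cap / 2 < n)
        cap *= 2;

    int  *slots = malloc(cap * sizeof *slots);
    bool *used  = calloc(cap, sizeof *used);
    if (slots == NULL || used == NULL) {
        free(slots);
        free(used);
        return (size_t)-1;
    }

    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        /* Simple multiplicative hash, reduced to a table index. */
        size_t h = ((uint32_t)a[i] * 2654435761u) & (cap - 1);
        while (used[h] && slots[h] != a[i])
            h = (h + 1) & (cap - 1);    /* linear probing */
        if (!used[h]) {                 /* value not seen before */
            used[h] = true;
            slots[h] = a[i];
            a[out++] = a[i];
        }
    }

    free(slots);
    free(used);
    return out;
}
```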

    4. The clear way

    If the number of elements is small, writing an optimized algorithm is not worthwhile when other code is quicker to write and easier to read.

    E.g. for each unique element in turn (i.e. the first element, then the second element once duplicates of the first have been removed, etc.), walk through the rest of the array removing all identical elements. O(1) extra space, O(n^2) time.
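
    The quadratic version is short enough to sketch in full. The name dedup_quadratic is made up for this example; the logic follows the description above, compacting the tail of the array once per surviving element.

```c
#include <assert.h>
#include <stddef.h>

/* For each element still in the array, remove every later element equal
   to it. O(1) extra space, O(n^2) time. Returns the new length. */
size_t dedup_quadratic(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    size_t len = n;
    for (size_t i = 0; i < len; i++) {
        size_t out = i + 1;
        for (size_t j = i + 1; j < len; j++)
            if (a[j] != a[i])
                a[out++] = a[j];    /* keep elements that differ from a[i] */
        len = out;                  /* the tail shrank by the duplicates */
    }
    return len;
}
```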

    E.g. use library functions that do this. Efficiency depends on which ones you have easily available.
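
    In plain C the obvious library candidate is qsort from <stdlib.h>, followed by one compaction pass. This is only a sketch with hypothetical names (cmp_int, dedup_qsort). The C standard does not specify qsort's algorithm, running time or memory use, so the cost really does depend on your library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator for qsort. Comparing instead of subtracting avoids the
   signed overflow that `x - y` can cause for values far apart. */
static int cmp_int(const void *pa, const void *pb)
{
    int x = *(const int *)pa;
    int y = *(const int *)pb;
    return (x > y) - (x < y);
}

/* Sort with the library routine, then drop adjacent repeats.
   Returns the new length; the original order is not preserved. */
size_t dedup_qsort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(a, n, sizeof a[0], cmp_int);
    size_t out = 1;
    for (size_t i = 1; i < n; i++)
        if (a[i] != a[out - 1])
            a[out++] = a[i];
    return out;
}
```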
