How to efficiently remove duplicates from an array without using Set

情深已故 2020-11-22 07:29

I was asked to write my own implementation to remove duplicated values in an array. Here is what I have created. But after tests with 1,000,000 elements it took very long ti

30 answers
  •  梦谈多话
    2020-11-22 07:57

    You need to sort your array, then loop over it and remove the duplicates. As you cannot use other tools, you need to write the code yourself.

    You can easily find examples of quicksort in Java on the internet (on which this example is based).

    // Requires: import java.util.Arrays;
    public static void main(String[] args) throws Exception {
        final int[] original = new int[]{1, 1, 2, 8, 9, 8, 4, 7, 4, 9, 1};
        System.out.println(Arrays.toString(original));
        // Step 1: sort so that equal values become adjacent.
        quicksort(original);
        System.out.println(Arrays.toString(original));
        // Step 2: copy each value that differs from its predecessor.
        final int[] unique = new int[original.length];
        int prev = original[0];
        unique[0] = prev;
        int count = 1;
        for (int i = 1; i < original.length; ++i) {
            if (original[i] != prev) {
                unique[count++] = original[i];
            }
            prev = original[i];
        }
        System.out.println(Arrays.toString(unique));
        // Step 3: trim the unused tail (zeros) from the result.
        final int[] compressed = new int[count];
        System.arraycopy(unique, 0, compressed, 0, count);
        System.out.println(Arrays.toString(compressed));
    }
    
    private static void quicksort(final int[] values) {
        if (values.length == 0) {
            return;
        }
        quicksort(values, 0, values.length - 1);
    }
    
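    // Partition around the middle element, then recurse on both halves.
    private static void quicksort(final int[] values, final int low, final int high) {
        int i = low, j = high;
        // low + (high - low) / 2 avoids int overflow in (low + high) / 2.
        int pivot = values[low + (high - low) / 2];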
        while (i <= j) {
            while (values[i] < pivot) {
                i++;
            }
            while (values[j] > pivot) {
                j--;
            }
            if (i <= j) {
                swap(values, i, j);
                i++;
                j--;
            }
        }
        if (low < j) {
            quicksort(values, low, j);
        }
        if (i < high) {
            quicksort(values, i, high);
        }
    }
    
    private static void swap(final int[] values, final int i, final int j) {
        final int temp = values[i];
        values[i] = values[j];
        values[j] = temp;
    }
    

    So the process runs in 3 steps.

    1. Sort the array - O(n log n)
    2. Remove duplicates - O(n)
    3. Compact the array - O(n)

    So this improves significantly on your O(n^3) approach.

    Output:

    [1, 1, 2, 8, 9, 8, 4, 7, 4, 9, 1]
    [1, 1, 1, 2, 4, 4, 7, 8, 8, 9, 9]
    [1, 2, 4, 7, 8, 9, 0, 0, 0, 0, 0]
    [1, 2, 4, 7, 8, 9]
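
    Steps 2 and 3 can also be pulled out into a reusable method. The following is only a minimal sketch: the class name `SortedDedupe` and the method name `dedupeSorted` are made up for illustration, and the method assumes its input is already sorted (for example by the `quicksort` above). Unlike the `main` method above, it returns early for an empty array instead of reading index 0.

```java
import java.util.Arrays;

public class SortedDedupe {
    // Hypothetical helper: returns the distinct values of an ALREADY SORTED array.
    public static int[] dedupeSorted(final int[] sorted) {
        if (sorted.length == 0) {
            return new int[0]; // nothing to read at index 0
        }
        final int[] unique = new int[sorted.length];
        int count = 0;
        unique[count++] = sorted[0];
        for (int i = 1; i < sorted.length; ++i) {
            // In sorted input, a value is new exactly when it differs from its predecessor.
            if (sorted[i] != sorted[i - 1]) {
                unique[count++] = sorted[i];
            }
        }
        // Compact: copy only the filled prefix into a right-sized array.
        final int[] compressed = new int[count];
        System.arraycopy(unique, 0, compressed, 0, count);
        return compressed;
    }

    public static void main(String[] args) {
        final int[] sorted = {1, 1, 1, 2, 4, 4, 7, 8, 8, 9, 9};
        System.out.println(Arrays.toString(dedupeSorted(sorted)));     // [1, 2, 4, 7, 8, 9]
        System.out.println(Arrays.toString(dedupeSorted(new int[0]))); // []
    }
}
```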
    

    EDIT

    The OP states that the values inside the array don't really matter, but that I can assume the range is 0-1000. This is a classic case where an O(n) sort can be used.

    We create an array of size range + 1, in this case 1001. We then loop over the data and increment the value at the index corresponding to each data point.

    We can then compact the resulting array, dropping values that have not been incremented. This makes the values unique, as we ignore the count.

    // Requires: import java.util.Arrays;
    public static void main(String[] args) throws Exception {
        final int[] original = new int[]{1, 1, 2, 8, 9, 8, 4, 7, 4, 9, 1, 1000, 1000};
        System.out.println(Arrays.toString(original));
        // One bucket per possible value in the range 0-1000.
        final int[] buckets = new int[1001];
        for (final int i : original) {
            buckets[i]++;
        }
        // Walk the buckets in index order; every non-empty bucket is one unique value.
        final int[] unique = new int[original.length];
        int count = 0;
        for (int i = 0; i < buckets.length; ++i) {
            if (buckets[i] > 0) {
                unique[count++] = i;
            }
        }
        // Trim the unused tail from the result.
        final int[] compressed = new int[count];
        System.arraycopy(unique, 0, compressed, 0, count);
        System.out.println(Arrays.toString(compressed));
    }
    

    Output:

    [1, 1, 2, 8, 9, 8, 4, 7, 4, 9, 1, 1000, 1000]
    [1, 2, 4, 7, 8, 9, 1000]
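
    Note that the bucket approach returns the values in ascending order. If the order of first appearance matters, a variant with a `boolean` array may work instead. This is only a sketch under the same 0-1000 assumption: the class name `SeenDedupe`, the method name `dedupeKeepOrder` and the `maxValue` parameter are hypothetical, and a value outside the range would throw an `ArrayIndexOutOfBoundsException`.

```java
import java.util.Arrays;

public class SeenDedupe {
    // Hypothetical helper: assumes every value lies in [0, maxValue].
    public static int[] dedupeKeepOrder(final int[] values, final int maxValue) {
        final boolean[] seen = new boolean[maxValue + 1];
        final int[] unique = new int[values.length];
        int count = 0;
        for (final int v : values) {
            // Keep a value only the first time it is encountered.
            if (!seen[v]) {
                seen[v] = true;
                unique[count++] = v;
            }
        }
        // Compact: copy only the filled prefix into a right-sized array.
        final int[] compressed = new int[count];
        System.arraycopy(unique, 0, compressed, 0, count);
        return compressed;
    }

    public static void main(String[] args) {
        final int[] original = {1, 1, 2, 8, 9, 8, 4, 7, 4, 9, 1, 1000, 1000};
        // prints [1, 2, 8, 9, 4, 7, 1000]
        System.out.println(Arrays.toString(dedupeKeepOrder(original, 1000)));
    }
}
```

    This is still a single O(n) pass over the data plus O(range) memory, and it skips the final scan over all 1001 buckets.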
    
