Uniquely identify an array of numbers whose order doesn't matter

家住魔仙堡 提交于 2019-12-11 04:37:02

问题


Let's say I have these 3 integer arrays:

int[] a = {1, 2, 3};
int[] b = {3, 4, 5};
int[] c = (2, 1, 3};

I'm looking for the most efficient code that will consider a the same as c (because they contain the same numbers but in a different order), but consider a different from b, and b different from c.

I know I can sort them all so that c becomes {1, 2, 3} and therefore the same as a, but I'm comparing hundreds of arrays with more than three numbers each and I don't want my program to sort each one of them, I'm thinking there must be a better way.

Also, taking the sum, for example, wouldn't work because the sum of numbers in {1, 4, 5} is the same as that of numbers in {1, 3, 6}.

And the product wouldn't work either because the product of numbers in {1, 2, 6} is the same as that of numbers in {1, 3, 4}.


回答1:


Sorting is an O(nlog(n) operation (in the worst case). You could, instead, have an O(n) solution by running over both arrays and just counting the elements in it:

public static boolean hasSameElements(int[] a, int[] b) {
    return countElements(a).equals(countElements(b);)
}

private static Map<Integer, Long> countElements(int[] arr) {
    return Arrays.stream(arr)
                 .boxed()
                 .collect(Collectors.groupingBy(Function.identity(), 
                          Collectors.counting()));
}

EDIT:
While it won't change the big-O notation of the algorithm, a slightly less elegant solution could perform better for non-matching arrays by failing fast:

public static boolean hasSameElements(int[] a, int[] b) {
    if (a.length != b.length) {
        return false;
    }

    Map<Integer, Long> popCount =
            Arrays.stream(a)
                  .mapToObj(Integer::valueOf)
                  .collect(Collectors.groupingBy(Function.identity(), 
                           Collectors.counting()));

    for (int elem : b) {
        Long count = popCount.get(elem);
        if (count == null) {
            return false;
        }
        count--;
        if (count == 0L) {
            popCount.remove(elem);
        } else {
            popCount.put(elem, count);
        }
    }

    return popCount.isEmpty();
}



回答2:


I assume that the arrays do not have any internal duplicates like, {1, 2, 2, 4}. First check that the lengths are the same. If they are the same length, then create a Set from the first array. Add the elements from the second array, A2, to the set one at a time. As each element from A2 is added, check if it was added and not rejected as a duplicate. If any element from A2 is not a duplicate, then the two arrays are not identical. If all elements from A2 are rejected as duplicates then the two arrays are identical in size and contents, though not necessarily in order.




回答3:


the idea of reducing all the numbers into the array to a single number for comparison is on the right track. This approach is called hashing. A good property of hashing is to avoid mapping multiple sets of inputs to the same output as much as possible. Addition and Multiplication are very poor hashing functions, and there is a large literature on good hashing functions. In java, you can use the Arrays.hashCode(int[]) function to get a good hashing of the array.

Of course, despite this hashing having good quality, there is no guarantee two arrays won't produce the same value. So, here is how you can implement a comparison function:

boolean equals(int[] a, int[]b) {
  if (a.length != b.length)
    return false;
  if (a == b)
    return true;
  if (Arrays.hashCode(a) != Arrays.hashCode(b)) 
    return false;
  for(int i=0; i< a.length; i++) {
    if (a[i] != b[i])
      return false;
  return true;
}

Note that the element-by-element comparison is executed only if the two parameters are not pointing to the same array, and the hashing is the same.

EDIT: it was pointed out to me that the Arrays.hashCode may produce different results for arrays with the same values in different orders, which is true, and in fact I was ignoring the ordering. Sorting the arrays first is O(MNlogN) in the average case (with M being the number of arrays and N the average length). However, if the number of repetition in the array is small the avg complexity is closer to O(MN) as sorting is needed only for arrays with the same length and hash code, which is very unlikely.

boolean equals(int[] a, int[]b) {
  if (a.length != b.length)
    return false;
  if (a == b)
    return true;
  if (hashCode(a) != hashCode(b)) 
    return false;
  Arrays.sort(a);
  Arrays.sort(b);
  for(int i=0; i< a.length; i++) {
    if (a[i] != b[i])
      return false;
  return true;
}

private int hashCode(int[] a) {
  int res = 0;
  for(int i=0; i<a.length; i++) {
    res ^= a[i]
  }
  return res;
}


来源:https://stackoverflow.com/questions/41202980/uniquely-identify-an-array-of-numbers-whose-order-doesnt-matter

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!