Fastest way to recreate the ArrayList in a for loop

野性不改 2020-11-29 13:24

In Java, using the following function for a huge matrix X to print its column-distinct elements:

// create the list of distinct values
List va         


        
4 answers
  •  無奈伤痛
    2020-11-29 13:47

    It would be much more efficient to use a Set instead of a List, for example the HashSet implementation. Its contains method runs in O(1) on average, instead of O(n) for a list. You can also save one call by calling only the add method, which returns false when the element is already present.
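To make the difference concrete, here is a minimal sketch for a single column. The question's own code is truncated, so the list-based method below is only an assumption about what a contains-then-add approach would look like; the class and method names (`DistinctDemo`, `distinctWithList`, `distinctWithSet`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DistinctDemo {

    // Assumed list-based approach: contains() scans the list linearly,
    // so every lookup costs O(current list size).
    static List<Integer> distinctWithList(int[] column) {
        List<Integer> values = new ArrayList<>();
        for (int x : column) {
            if (!values.contains(x)) {
                values.add(x);
            }
        }
        return values;
    }

    // Set-based approach: a single add() call both tests membership and
    // inserts. It returns true only if x was not already in the set.
    static List<Integer> distinctWithSet(int[] column) {
        Set<Integer> seen = new HashSet<>();
        List<Integer> distinct = new ArrayList<>(); // keeps first-seen order
        for (int x : column) {
            if (seen.add(x)) {
                distinct.add(x);
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        int[] column = {3, 1, 3, 2, 1};
        System.out.println(distinctWithList(column)); // [3, 1, 2]
        System.out.println(distinctWithSet(column));  // [3, 1, 2]
    }
}
```

Both methods return the same values in first-seen order; only the cost of the membership test differs.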

    As for your specific question, I would just create a new Set on each iteration: object creation is not that expensive, and is probably cheaper than clearing the set (the benchmark at the bottom confirms this only for a pre-sized set; see the most efficient version in EDIT 2):

    // needs: import java.util.HashSet; import java.util.Set;
    for (int j = 0; j < m; j++) {
        Set<Integer> values = new HashSet<>(); // fresh set for each column
        for (int i = 0; i < n; i++) {
            int x = X[i][j];
            // values.add returns true only if the element was NOT in the set before
            if (!values.add(x)) continue;
            System.out.println(x);
        }
    }
    

    However, the only way to know which is quicker (new object vs. clear) is to profile that portion of your code and check the performance of both versions.

    EDIT

    I ran a quick benchmark and the clear version seems a little faster than creating a set at each loop (by about 20%). You should still check on your dataset / use case which one is better. Faster code with my dataset:

    Set<Integer> values = new HashSet<>(); // one set, reused for every column
    for (int j = 0; j < m; j++) {
        for (int i = 0; i < n; i++) {
            int x = X[i][j];
            // values.add returns true only if the element was NOT in the set before
            if (!values.add(x)) continue;
            System.out.println(x);
        }
        values.clear(); // empty the set before the next column
    }
    

    EDIT 2

    An even faster version of the code is obtained by creating a new set of the right size on each iteration:

    for (int j = 0; j < m; j++) {
        // right size from the beginning: initial capacity n, load factor 1
        Set<Integer> values = new HashSet<>(n, 1f);
        for (int i = 0; i < n; i++) {
            int x = X[i][j];
            // values.add returns true only if the element was NOT in the set before
            if (!values.add(x)) continue;
            System.out.println(x);
        }
    }
    

    Summary of results

    After JVM warm-up and JIT compilation:

    Set<Integer> values = new HashSet<>(n, 1f);  =====> 280 ms
    values.clear();                              =====> 380 ms
    Set<Integer> values = new HashSet<>();       =====> 450 ms
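The benchmark code itself is not shown in the answer, so the following is only a rough sketch of how the three variants could be timed on your own data. Everything in it is an assumption: the class and method names, the matrix dimensions, the random data, and the choice to count distinct values instead of printing them, so that console I/O does not dominate the measurement. A dedicated harness such as JMH would give more trustworthy numbers than this hand-rolled timer.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.function.ToIntFunction;

class SetReuseBenchmark {

    static volatile int sink; // keeps the JIT from discarding the results

    // Variant 1: new default-sized set for each column.
    static int newSetPerColumn(int[][] X) {
        int n = X.length, m = X[0].length, count = 0;
        for (int j = 0; j < m; j++) {
            Set<Integer> values = new HashSet<>();
            for (int i = 0; i < n; i++) {
                if (values.add(X[i][j])) count++;
            }
        }
        return count;
    }

    // Variant 2: one set, cleared after each column.
    static int clearedSet(int[][] X) {
        int n = X.length, m = X[0].length, count = 0;
        Set<Integer> values = new HashSet<>();
        for (int j = 0; j < m; j++) {
            for (int i = 0; i < n; i++) {
                if (values.add(X[i][j])) count++;
            }
            values.clear();
        }
        return count;
    }

    // Variant 3: new set per column, pre-sized to n with load factor 1.
    static int presizedSetPerColumn(int[][] X) {
        int n = X.length, m = X[0].length, count = 0;
        for (int j = 0; j < m; j++) {
            Set<Integer> values = new HashSet<>(n, 1f);
            for (int i = 0; i < n; i++) {
                if (values.add(X[i][j])) count++;
            }
        }
        return count;
    }

    // Runs the strategy a few times untimed (warm-up), then times `runs` calls.
    static long timeMillis(ToIntFunction<int[][]> strategy, int[][] X,
                           int warmups, int runs) {
        for (int k = 0; k < warmups; k++) sink += strategy.applyAsInt(X);
        long start = System.nanoTime();
        for (int k = 0; k < runs; k++) sink += strategy.applyAsInt(X);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Hypothetical data: a non-empty rectangular 2000 x 500 matrix.
        int n = 2000, m = 500;
        Random random = new Random(42);
        int[][] X = new int[n][m];
        for (int[] row : X) {
            for (int j = 0; j < m; j++) row[j] = random.nextInt(100);
        }
        System.out.println("new HashSet<>(n, 1f): "
                + timeMillis(SetReuseBenchmark::presizedSetPerColumn, X, 20, 20) + " ms");
        System.out.println("values.clear():       "
                + timeMillis(SetReuseBenchmark::clearedSet, X, 20, 20) + " ms");
        System.out.println("new HashSet<>():      "
                + timeMillis(SetReuseBenchmark::newSetPerColumn, X, 20, 20) + " ms");
    }
}
```

The ranking can change with the matrix shape and the number of distinct values per column, so it is worth running something like this against a representative sample of your own data.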
    
