Is it faster to add to a collection then sort it, or add to a sorted collection?

忘掉有多难 2020-11-28 04:36

If I have a Map like this:

HashMap map;

and I want to obtain a collection of values sorted us

7 Answers
  •  無奈伤痛
    2020-11-28 05:00

    Why not use the best of both worlds? If you won't need the set again, sort using a TreeSet and initialize an ArrayList with its contents:

    // Note: a TreeSet keeps only one of any values that compare as equal
    List sortedCollection =
        new ArrayList(
              new TreeSet(map.values()));
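
    For comparison, here is a minimal generic sketch of the copy-then-sort approach next to the TreeSet-then-copy approach. The class name, the method names and the sample map are made up for illustration and are not taken from the benchmark. It assumes the values implement Comparable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper class, not part of the original benchmark.
class SortedValuesSketch {

    // Copy the values into a list, then sort the list. Duplicates are kept.
    static <K, V extends Comparable<? super V>> List<V> copyThenSort(Map<K, V> map) {
        List<V> list = new ArrayList<>(map.values());
        Collections.sort(list);
        return list;
    }

    // Sort through a TreeSet, then copy into an ArrayList for cheap iteration.
    // Values that compare as equal collapse into a single element.
    static <K, V extends Comparable<? super V>> List<V> treeSetThenCopy(Map<K, V> map) {
        return new ArrayList<>(new TreeSet<>(map.values()));
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 3);
        map.put("b", 1);
        map.put("c", 3); // duplicate value on purpose
        map.put("d", 2);

        System.out.println(copyThenSort(map));    // [1, 2, 3, 3]
        System.out.println(treeSetThenCopy(map)); // [1, 2, 3]
    }
}
```

    The two results only match when no two values compare as equal, so the TreeSet route is safe only if duplicates cannot occur or do not matter.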
    

    EDIT:

    I have created a benchmark (you can access it at pastebin.com/5pyPMJav) to test the three approaches (ArrayList + Collections.sort, TreeSet, and my best-of-both-worlds approach), and mine always wins. The test file creates a map with 10000 elements, the values of which have an intentionally awful comparator, and then each of the three strategies gets a chance to a) sort the data and b) iterate over it. Here is some sample output (you can test it yourself):

    EDIT: I have added an aspect that logs calls to Thingy.compareTo(Thingy), and I have also added a new strategy based on PriorityQueue that is much faster than any of the previous solutions (at least in sorting).

    compareTo() calls:123490
    Transformer ArrayListTransformer
        Creation: 255885873 ns (0.255885873 seconds) 
        Iteration: 2582591 ns (0.002582591 seconds) 
        Item count: 10000
    
    compareTo() calls:121665
    Transformer TreeSetTransformer
        Creation: 199893004 ns (0.199893004 seconds) 
        Iteration: 4848242 ns (0.004848242 seconds) 
        Item count: 10000
    
    compareTo() calls:121665
    Transformer BestOfBothWorldsTransformer
        Creation: 216952504 ns (0.216952504 seconds) 
        Iteration: 1604604 ns (0.001604604 seconds) 
        Item count: 10000
    
    compareTo() calls:18819
    Transformer PriorityQueueTransformer
        Creation: 35119198 ns (0.035119198 seconds) 
        Iteration: 2803639 ns (0.002803639 seconds) 
        Item count: 10000
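
    One thing worth checking about the PriorityQueue numbers: the iterator of java.util.PriorityQueue is not guaranteed to traverse the elements in any particular order. Only repeated poll() calls return them sorted. If the strategy just fills the queue and iterates over it, the low compareTo() count may cover only building the heap and not a full sort. Below is a small sketch for checking this. The class name, the helper names and the counting comparator are assumptions made for illustration and do not come from the pastebin code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper class, not part of the original benchmark.
class PriorityQueueOrderSketch {

    // Empties the queue with poll(), which always returns the smallest
    // remaining element, so the resulting list is sorted.
    static <T> List<T> drainSorted(PriorityQueue<T> queue) {
        List<T> out = new ArrayList<>(queue.size());
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    // Natural ordering for Integer that increments the counter on every comparison.
    static Comparator<Integer> counting(AtomicLong counter) {
        return (a, b) -> {
            counter.incrementAndGet();
            return a.compareTo(b);
        };
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            values.add(i);
        }
        Collections.shuffle(values, new Random(42));

        AtomicLong counter = new AtomicLong();
        PriorityQueue<Integer> queue = new PriorityQueue<>(counting(counter));
        queue.addAll(values);
        long afterBuild = counter.get();

        List<Integer> sorted = drainSorted(queue);
        long afterDrain = counter.get();

        System.out.println("comparisons to build the heap: " + afterBuild);
        System.out.println("comparisons to drain it:       " + (afterDrain - afterBuild));
        System.out.println("first element after draining:  " + sorted.get(0));
    }
}
```

    If the drain phase accounts for most of the comparisons, the fair figure to compare against the other strategies is the build count plus the drain count.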
    

    Strangely, my approach performs best in iteration. I would have expected no difference from the ArrayList approach there, since both end up iterating over an ArrayList. Do I have a bug in my benchmark?

    Disclaimer: I know this is probably an awful benchmark, but it helps get the point across to you and I certainly did not manipulate it to make my approach win.

    (The code depends on Apache Commons Lang for the equals / hashCode / compareTo builders, but it should be easy to refactor that out.)
