Sorting a List in parallel without creating a temporary array in Java 8

小鲜肉 2020-12-03 13:13

Java 8 provides java.util.Arrays.parallelSort, which sorts arrays in parallel using the fork/join framework. But there's no corresponding Collections.parallelSort

5 Answers
  •  感情败类
    2020-12-03 13:52

    Just speculating here, but I see several good reasons for generic sort algorithms preferring to work on arrays instead of List instances:

    • Element access is performed via method calls. Despite all the optimizations the JIT can apply, even for a list that implements RandomAccess, this probably means a lot of overhead compared to plain array accesses, which can be optimized very well.
    • Many algorithms require copying fragments of the array to temporary structures. There are efficient methods for copying arrays or their fragments. An arbitrary List instance, on the other hand, can't be easily copied. New lists would have to be allocated, which poses two problems. First, this means allocating new objects, which is likely more costly than allocating arrays. Second, the algorithm would have to choose which implementation of List to allocate for the temporary structure. There are two obvious solutions, both bad. Either choose some hard-coded implementation, e.g. ArrayList, but then it could just as well allocate simple arrays (and if we're generating arrays, it's much easier if the source is also an array). Or let the user provide some list factory object, which makes the code much more complicated.
    • Related to the previous issue: there is no obvious way of copying a list into another because of how the API is designed. The best the List interface offers is the addAll() method, but this is probably not efficient in most cases (think of pre-allocating the new list to its target size vs. adding elements one by one, which many implementations do).
    • Most lists that need to be sorted will be small enough for another copy to not be an issue.
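
    The copy-based approach implied by these points can be sketched as follows. This is only an illustration, not a JDK API: ParallelListSort and its parallelSort method are hypothetical names. The sketch copies the list into an array, sorts that array with Arrays.parallelSort, and writes the result back through a ListIterator, which is essentially the pattern the default List.sort uses with Arrays.sort.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.ListIterator;

final class ParallelListSort {
    private ParallelListSort() {}

    /**
     * Hypothetical helper (not part of the JDK): sorts the list in place
     * by going through a temporary array.
     */
    @SuppressWarnings("unchecked")
    static <T> void parallelSort(List<T> list, Comparator<? super T> cmp) {
        // One bulk copy out of the list; this is the temporary array.
        T[] tmp = (T[]) list.toArray();

        // Parallel merge sort on the array (fork/join common pool).
        Arrays.parallelSort(tmp, cmp);

        // Write the sorted elements back without relying on indexed access,
        // so this also works reasonably for non-RandomAccess lists.
        ListIterator<T> it = list.listIterator();
        for (T e : tmp) {
            it.next();
            it.set(e);
        }
    }
}
```

    A call such as ParallelListSort.parallelSort(names, Comparator.naturalOrder()) would sort a mutable list in place. The list must support ListIterator.set, so unmodifiable lists are not supported by this sketch.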

    So the designers probably put CPU efficiency and code simplicity first, and both are easily achieved when the API accepts arrays. Some languages, e.g. Scala, have sort methods that work directly on lists, but this comes at a cost and is probably less efficient than sorting arrays in many cases (or sometimes a conversion to and from an array is probably just performed behind the scenes).
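
    If the data can live in an array you own from the start, the explicit conversion can be avoided: Arrays.asList returns a fixed-size List view backed by the given array, so sorting the array is visible through the list. A minimal sketch, assuming a hypothetical ArrayBackedSort class. Note that Arrays.parallelSort is documented to need working space of its own, so this removes only the explicit copy, not every temporary allocation.

```java
import java.util.Arrays;
import java.util.List;

final class ArrayBackedSort {
    private ArrayBackedSort() {}

    /**
     * Hypothetical helper (not part of the JDK): sorts the caller's array
     * in parallel and returns a List view backed by that same array.
     */
    static <T extends Comparable<? super T>> List<T> sortedView(T[] backing) {
        // Sorts the array itself; no list-to-array conversion is needed.
        Arrays.parallelSort(backing);

        // Fixed-size view: reads and set() go straight to the array,
        // add() and remove() throw UnsupportedOperationException.
        return Arrays.asList(backing);
    }
}
```

    This only helps when a fixed-size list is acceptable; it does nothing for an existing ArrayList or LinkedList, where a copy like the one described above is still needed.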
