Why don't primitive Streams have collect(Collector)?

-上瘾入骨i 2020-12-14 06:36

I'm writing a library for novice programmers, so I'm trying to keep the API as clean as possible.

One of the things my Library needs to do is perform some complex c

5 answers
  •  北荒 (original poster)
    2020-12-14 07:18

    Mr. Goetz provided the definitive answer for why the decision was made not to include specialized Collectors. However, I wanted to investigate further how much this decision affects performance.

    I thought I would post my results as an answer.

    I used the JMH microbenchmark framework to time the calculation with both kinds of Collectors over collections of 1, 100, 1,000, 100,000 and 1 million elements:

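    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;
    import java.util.stream.LongStream;
    
    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;
    
    // BusinessObj, BusinessObjFactory and MyUtil are my own classes (not shown here).
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Thread)
    public class MyBenchmark {
    
        // JMH runs every benchmark once per collection size
        @Param({"1", "100", "1000", "100000", "1000000"})
        public int size;
    
        List<BusinessObj> seqs;
    
        @Setup
        public void setup() {
            seqs = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                // these lengths are random but above 127, so boxed values
                // do not come from the Long cache
                seqs.add(BusinessObjFactory.createOfRandomLength());
            }
        }
    
        // Boxed path: Stream<Long> fed into a Collector
        @Benchmark
        public double objectCollector() {
            return seqs.stream()
                       .map(BusinessObj::getLength)
                       .collect(MyUtil.myCalcLongCollector())
                       .getAsDouble();
        }
    
        // Primitive path: LongStream handed to a helper method
        @Benchmark
        public double primitiveCollector() {
            LongStream stream = seqs.stream()
                                    .mapToLong(BusinessObj::getLength);
            return MyUtil.myCalc(stream)
                         .getAsDouble();
        }
    
        public static void main(String[] args) throws RunnerException {
            Options opt = new OptionsBuilder()
                                .include(MyBenchmark.class.getSimpleName())
                                .build();
    
            new Runner(opt).run();
        }
    
    }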
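
    The MyUtil helpers used by the benchmark are not shown in the post. As a rough sketch only, assuming the calculation were a plain average (an assumption; the real calculation is not given), the two styles could look like the following. AverageSketch, averagingCollector and average are hypothetical names, not the author's code. The boxed variant is a Collector that plugs into Stream.collect, while the primitive variant has to go through LongStream's three-argument collect, because LongStream offers no collect(Collector).

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;
import java.util.stream.Collector;
import java.util.stream.LongStream;

// Hypothetical stand-in for the MyUtil helpers, which the post does not show.
// The "calculation" here is assumed to be a simple average.
final class AverageSketch {

    // Boxed variant: a Collector usable with Stream<Long>.collect(...).
    // Each element arrives as a Long and is unboxed inside the accumulator.
    static Collector<Long, long[], OptionalDouble> averagingCollector() {
        return Collector.of(
                () -> new long[2],                               // [0] = count, [1] = sum
                (acc, value) -> { acc[0]++; acc[1] += value; },
                (left, right) -> { left[0] += right[0]; left[1] += right[1]; return left; },
                acc -> acc[0] == 0
                        ? OptionalDouble.empty()
                        : OptionalDouble.of((double) acc[1] / acc[0]));
    }

    // Primitive variant: LongStream only has the three-argument
    // collect(supplier, accumulator, combiner), so the finishing step is done by hand.
    static OptionalDouble average(LongStream stream) {
        long[] acc = stream.collect(
                () -> new long[2],
                (a, value) -> { a[0]++; a[1] += value; },        // ObjLongConsumer, no boxing
                (left, right) -> { left[0] += right[0]; left[1] += right[1]; });
        return acc[0] == 0
                ? OptionalDouble.empty()
                : OptionalDouble.of((double) acc[1] / acc[0]);
    }

    public static void main(String[] args) {
        List<Long> lengths = Arrays.asList(200L, 300L, 400L);
        // Both lines print 300.0
        System.out.println(lengths.stream().collect(averagingCollector()).getAsDouble());
        System.out.println(average(lengths.stream().mapToLong(Long::longValue)).getAsDouble());
    }
}
```

    The call sites differ in the same way as the two benchmark methods: the boxed version chains directly into collect(...), while the primitive version wraps the stream in a helper call.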
    

    Here are the results:

    # JMH 1.9.3 (released 4 days ago)
    # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_31.jdk/Contents/Home/jre/bin/java
    # VM options: 
    # Warmup: 20 iterations, 1 s each
    # Measurement: 20 iterations, 1 s each
    # Timeout: 10 min per iteration
    # Threads: 1 thread, will synchronize iterations
    # Benchmark mode: Average time, time/op
    # Benchmark: org.sample.MyBenchmark.objectCollector
    
    # Run complete. Total time: 01:30:31
    
    Benchmark                        (size)  Mode  Cnt          Score         Error  Units
    MyBenchmark.objectCollector           1  avgt  200        140.803 ±       1.425  ns/op
    MyBenchmark.objectCollector         100  avgt  200       5775.294 ±      67.871  ns/op
    MyBenchmark.objectCollector        1000  avgt  200      70440.488 ±    1023.177  ns/op
    MyBenchmark.objectCollector      100000  avgt  200   10292595.233 ±  101036.563  ns/op
    MyBenchmark.objectCollector     1000000  avgt  200  100147057.376 ±  979662.707  ns/op
    MyBenchmark.primitiveCollector        1  avgt  200        140.971 ±       1.382  ns/op
    MyBenchmark.primitiveCollector      100  avgt  200       4654.527 ±      87.101  ns/op
    MyBenchmark.primitiveCollector     1000  avgt  200      60929.398 ±    1127.517  ns/op
    MyBenchmark.primitiveCollector   100000  avgt  200    9784655.013 ±  113339.448  ns/op
    MyBenchmark.primitiveCollector  1000000  avgt  200   94822089.334 ± 1031475.051  ns/op
    

    As you can see, the primitive Stream version is slightly faster, but even with 1 million elements in the collection it is only about 5 milliseconds (0.005 seconds, roughly 5%) faster on average.
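
    To read the table in per-element terms, the gap between the two benchmarks can be divided by the collection size. The short sketch below does that arithmetic with the scores copied from the results above; BoxingOverhead and its method names are made up for illustration, and the measurement error bars are ignored.

```java
// Illustrative arithmetic only: scores are copied from the results table (ns/op).
final class BoxingOverhead {

    static final int[] SIZES = {1, 100, 1000, 100000, 1000000};
    static final double[] OBJECT_NS =
            {140.803, 5775.294, 70440.488, 10292595.233, 100147057.376};
    static final double[] PRIMITIVE_NS =
            {140.971, 4654.527, 60929.398, 9784655.013, 94822089.334};

    // Extra time per operation for the boxed collector, in nanoseconds.
    static double gapNanos(int i) {
        return OBJECT_NS[i] - PRIMITIVE_NS[i];
    }

    // The same gap spread over the number of elements processed.
    static double gapPerElementNanos(int i) {
        return gapNanos(i) / SIZES[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("size=%d gap=%.3f ms per-element=%.2f ns%n",
                    SIZES[i], gapNanos(i) / 1_000_000.0, gapPerElementNanos(i));
        }
    }
}
```

    For the larger sizes this works out to a few nanoseconds per element; at size 1 the two scores are within each other's error margins.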

    For my API I would rather keep to the cleaner object Stream conventions and use the boxed version, since the performance penalty is so minor.

    Thanks to everyone who shed insight into this issue.
