Java 8 Streams - collect vs reduce

天涯浪人 2020-11-28 01:17

When would you use collect() vs reduce()? Does anyone have good, concrete examples of when it's definitely better to go one way or the other?

7 answers
  •  盖世英雄少女心
    2020-11-28 01:59

    The two differ greatly in their potential memory footprint at runtime. collect() gathers every element that reaches the end of the stream into a collection, whereas reduce() requires you to state explicitly how those elements are combined into a single result.
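
    As a minimal sketch of that difference (the class and method names here are illustrative and not part of the original answer), reduce() can fold a stream into one value while collect() builds a container holding every element:

    ```java
    import java.util.List;
    import java.util.stream.Collectors;

    class CollectVsReduceDemo {
        // reduce(): folds the elements into a single value; no container is built.
        static int sum(List<Integer> numbers) {
            return numbers.stream().reduce(0, Integer::sum);
        }

        // collect(): gathers every mapped element into a new list,
        // so memory use grows with the number of elements.
        static List<Integer> doubled(List<Integer> numbers) {
            return numbers.stream()
                    .map(n -> n * 2)
                    .collect(Collectors.toList());
        }
    }
    ```

    For a small input both are fine; the difference only starts to matter when the stream is large and the collected result is not actually needed.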

    For example, if you want to read data from a file, process it, and save it to a database, you might end up with Java stream code similar to this:

    streamDataFromFile(file)
                .map(data -> processData(data))
                .map(result -> database.save(result))
                .collect(Collectors.toList());
    

    Here collect() serves only as the terminal operation that forces Java to pull the data through the stream and save each result to the database. Streams are lazy: without a terminal operation such as collect(), the data is never read and never stored.
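
    The laziness described above can be observed with a small sketch. The counter and the class name are assumptions made for illustration; the map stage stands in for processData/database.save:

    ```java
    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    class LazyStreamDemo {
        // Counts how many elements actually pass through the map stage.
        static final AtomicInteger processed = new AtomicInteger();

        static String process(String s) {
            processed.incrementAndGet();
            return s.toUpperCase();
        }

        // Builds the pipeline but never attaches a terminal operation.
        static int withoutTerminalOperation() {
            processed.set(0);
            Stream<String> pipeline = Stream.of("a", "b", "c").map(LazyStreamDemo::process);
            return processed.get(); // the map stage has not run
        }

        // Same pipeline, now driven by collect().
        static int withCollect() {
            processed.set(0);
            List<String> results = Stream.of("a", "b", "c")
                    .map(LazyStreamDemo::process)
                    .collect(Collectors.toList());
            return processed.get(); // every element was processed
        }
    }
    ```

    In this sketch withoutTerminalOperation() reports that nothing was processed, while withCollect() reports all three elements.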

    This code will happily fail with java.lang.OutOfMemoryError: Java heap space if the file is large enough or the heap is small enough. The reason is that collect() accumulates every element that made it through the stream (each of which has, in fact, already been stored in the database) in the resulting list, and that list blows up the heap.

    If you replace collect() with reduce(), however, the problem goes away, because reduce() folds each element into the accumulator and then discards it.

    In the example above, replace the collect() call with a reduce() call such as:

    .reduce(0L, (aLong, result) -> aLong, (aLong1, aLong2) -> aLong1);
    

    You do not even need to make the computation depend on the results. Java is not a pure FP (functional programming) language, so it cannot optimize away elements that go unused at the end of the stream, because processing them may have side effects.
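
    A self-contained sketch of this reduce-based approach follows. The save method and its counter are hypothetical stand-ins for database.save(result), and IntStream.range stands in for streamDataFromFile(file):

    ```java
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.stream.IntStream;

    class ReduceDrainDemo {
        // Hypothetical stand-in for the database: only counts the saves.
        static final AtomicLong saved = new AtomicLong();

        static Integer save(Integer value) {
            saved.incrementAndGet(); // the side effect we care about
            return value;
        }

        // Drives the whole pipeline without retaining any element.
        static long drain(int elementCount) {
            saved.set(0);
            return IntStream.range(0, elementCount)
                    .boxed()
                    .map(ReduceDrainDemo::save)
                    // The accumulator ignores each result and keeps the identity value.
                    .reduce(0L, (acc, result) -> acc, (left, right) -> left);
        }
    }
    ```

    The returned value is always the identity 0L; the point of the call is only that every element passes through save and can then be garbage collected.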
