How do aggregate operations work on Java streams?

浪尽此生 submitted on 2019-12-24 00:26:55

Question


In the following excerpt from the Java tutorials on aggregate operations, we group the names of the people by their gender.

Map<Person.Sex, List<String>> namesByGender =
roster
    .stream()
    .collect(
        Collectors.groupingBy(
            Person::getGender,                      
            Collectors.mapping(
                Person::getName,
                Collectors.toList())));

I understand that the collect operation:
1) Groups each Person in the stream by the result of getGender.
2) Maps each Person to the result of getName.
3) Forms a list from the results.
4) Generates a Map whose keys are the persons' genders and whose values are lists of the persons' names.

My questions are:
1) In what order do the Collectors act?
2) What are the intermediate types between them?


Answer 1:


The Collectors do not “act”, so there is no order in which they act. It’s not like one Collector processes all data before passing it to another using an intermediate type.

You are composing a single Collector with these factories, which will do the entire work at once when being passed to Stream.collect.

As the documentation of the Collector interface explains:

A Collector is specified by four functions that work together to accumulate entries into a mutable result container, and optionally perform a final transform on the result. They are:

  • creation of a new result container (supplier())
  • incorporating a new data element into a result container (accumulator())
  • combining two result containers into one (combiner())
  • performing an optional final transform on the container (finisher())
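
To make the roles of these four functions concrete, here is a minimal sketch that drives a collector by hand, roughly the way a sequential Stream.collect could. The class ManualCollect and the helper collectManually are hypothetical names made up for this illustration; they are not part of the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ManualCollect {
    // Hypothetical helper: applies a collector's functions by hand, sequentially.
    static <T, A, R> R collectManually(Iterable<? extends T> source,
                                       Collector<? super T, A, R> collector) {
        A container = collector.supplier().get();              // create the result container
        BiConsumer<A, ? super T> accumulator = collector.accumulator();
        for (T element : source) {
            accumulator.accept(container, element);            // incorporate each element
        }
        // The combiner is not used here: it only matters when partial
        // containers have to be merged, e.g. in a parallel evaluation.
        return collector.finisher().apply(container);          // final transform
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Bob", "Cid");
        System.out.println(collectManually(names, Collectors.toList()));
        System.out.println(collectManually(names, Collectors.joining(", ")));
    }
}
```

The same helper works for any collector, including composed ones, because a composed collector still exposes exactly these four functions.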

So the toList() collector can be implemented as simply as ArrayList::new as supplier, List::add as accumulator, and a combiner built on List::addAll, without needing a custom finisher. This is indeed how the reference implementation does it, but that’s an implementation detail; other implementations are allowed.
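
As an illustration, a list collector along those lines can be written with the Collector.of factory. This is only a sketch of the idea, not the JDK source; the names SimpleToList and simpleToList are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class SimpleToList {
    // Hypothetical toList()-like collector built from three functions.
    static <T> Collector<T, ?, List<T>> simpleToList() {
        return Collector.<T, List<T>>of(
            ArrayList::new,                                          // supplier: new empty list
            List::add,                                               // accumulator: append one element
            (left, right) -> { left.addAll(right); return left; }); // combiner: merge two partial lists
        // No finisher is passed: the container itself is the result.
    }

    public static void main(String[] args) {
        List<String> letters = Stream.of("a", "b", "c").collect(simpleToList());
        System.out.println(letters);
    }
}
```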

Then, Collectors.mapping composes a new collector from the specified downstream collector by decorating its accumulator function: the specified mapper function is applied first and its result is passed to the original accumulator function. The result is again a collector consisting of four functions. During the collect operation, the mapper function will be applied to each element right before it is added to the list.
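
The decoration described above could look roughly like the following sketch. SimpleMapping and simpleMapping are hypothetical names, and for brevity the sketch does not carry over the downstream collector's characteristics, which the real factory may handle differently.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SimpleMapping {
    // Hypothetical mapping()-like factory: only the accumulator is wrapped.
    static <T, U, A, R> Collector<T, ?, R> simpleMapping(
            Function<? super T, ? extends U> mapper,
            Collector<? super U, A, R> downstream) {
        BiConsumer<A, ? super U> downstreamAccumulator = downstream.accumulator();
        return Collector.<T, A, R>of(
            downstream.supplier(),                                   // same container as downstream
            (container, element) ->                                  // map first, then accumulate
                downstreamAccumulator.accept(container, mapper.apply(element)),
            downstream.combiner(),                                   // unchanged
            downstream.finisher());                                  // unchanged
    }

    public static void main(String[] args) {
        List<Integer> lengths = Stream.of("Ann", "Bobby")
            .collect(simpleMapping(String::length, Collectors.toList()));
        System.out.println(lengths);
    }
}
```

Note that no intermediate collection of mapped values is ever created; each mapped value goes straight into the downstream container.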

Finally, Collectors.groupingBy will do a much more complex composition. The supplier will create a new map, typically a HashMap. The accumulator will evaluate your grouping function and look up the resulting key in the map, using an operation like Map.computeIfAbsent, which will evaluate the downstream collector’s supplier if the key is new. It then applies the downstream collector’s accumulator function, which is the composed function in your scenario. The combiner function relies on Map.merge, using either map’s value if only one contains a key, or the downstream’s combiner if a key is present in both maps.
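
Put together, such a composition might be sketched as follows. This is a simplified illustration under the assumptions stated in the comments, not the JDK implementation; SimpleGroupingBy and simpleGroupingBy are hypothetical names, and the finisher here simply copies into a second map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SimpleGroupingBy {
    // Hypothetical groupingBy()-like factory built on the downstream's four functions.
    static <T, K, A, D> Collector<T, ?, Map<K, D>> simpleGroupingBy(
            Function<? super T, ? extends K> classifier,
            Collector<? super T, A, D> downstream) {
        Supplier<A> downstreamSupplier = downstream.supplier();
        BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
        BinaryOperator<A> downstreamCombiner = downstream.combiner();
        Function<A, D> downstreamFinisher = downstream.finisher();

        return Collector.<T, Map<K, A>, Map<K, D>>of(
            HashMap::new,                                    // supplier: empty map of containers
            (map, element) -> {                              // accumulator
                K key = Objects.requireNonNull(classifier.apply(element), "null key");
                A container = map.computeIfAbsent(key, k -> downstreamSupplier.get());
                downstreamAccumulator.accept(container, element);
            },
            (left, right) -> {                               // combiner: merge per key
                right.forEach((key, value) -> left.merge(key, value, downstreamCombiner));
                return left;
            },
            map -> {                                         // finisher: finish every container
                Map<K, D> result = new HashMap<>();
                map.forEach((key, container) ->
                    result.put(key, downstreamFinisher.apply(container)));
                return result;
            });
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> byLength = Stream.of("a", "bb", "cc")
            .collect(simpleGroupingBy(String::length, Collectors.toList()));
        System.out.println(byLength);
    }
}
```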

So the processing of a composed collector consists of an interleaved processing of the specified functions rather than processing one collector after the other. In other words, for a sequential execution, the operation will be equivalent to

Map<Person.Sex, List<String>> namesByGender = new HashMap<>();
for (Person p : roster) {
    // Create the list for this gender on first use, then append the name.
    namesByGender.computeIfAbsent(p.getGender(), k -> new ArrayList<>()).add(p.getName());
}
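
To check that equivalence for yourself, a small self-contained comparison could look like this. The Person class below is a stripped-down, hypothetical stand-in for the tutorial's class, and the roster contents are invented sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingEquivalence {
    // Hypothetical minimal stand-in for the tutorial's Person class.
    static class Person {
        enum Sex { MALE, FEMALE }
        private final String name;
        private final Sex gender;
        Person(String name, Sex gender) { this.name = name; this.gender = gender; }
        String getName() { return name; }
        Sex getGender() { return gender; }
    }

    // The composed collector from the question.
    static Map<Person.Sex, List<String>> withCollector(List<Person> roster) {
        return roster.stream().collect(
            Collectors.groupingBy(Person::getGender,
                Collectors.mapping(Person::getName, Collectors.toList())));
    }

    // The plain loop that does the same work element by element.
    static Map<Person.Sex, List<String>> withLoop(List<Person> roster) {
        Map<Person.Sex, List<String>> namesByGender = new HashMap<>();
        for (Person p : roster) {
            namesByGender.computeIfAbsent(p.getGender(), k -> new ArrayList<>()).add(p.getName());
        }
        return namesByGender;
    }

    public static void main(String[] args) {
        List<Person> roster = Arrays.asList(
            new Person("Fred", Person.Sex.MALE),
            new Person("Jane", Person.Sex.FEMALE),
            new Person("Bob", Person.Sex.MALE));
        System.out.println(withCollector(roster).equals(withLoop(roster))); // prints true
    }
}
```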



Answer 2:


If we look at the source of groupingBy, we'll see the following:

Supplier<A> downstreamSupplier = downstream.supplier();
BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
BiConsumer<Map<K, A>, T> accumulator = (m, t) -> {
    K key = Objects.requireNonNull(classifier.apply(t), "element cannot be mapped to a null key");
    A container = m.computeIfAbsent(key, k -> downstreamSupplier.get());
    downstreamAccumulator.accept(container, t);
};

First, it computes the key by calling Person::getGender.

Second, if the key is not yet in the map, it creates a new downstream container via ArrayList::new.

Third, it adds the value returned by Person::getName to that container via List::add.

ArrayList::new and List::add are the arguments passed to the CollectorImpl constructor, as we can see by looking at the Collectors.toList method.



Source: https://stackoverflow.com/questions/47125135/how-do-aggregate-operations-work-on-java-streams
