Multiple aggregate functions in Java 8 Stream API

时光总嘲笑我的痴心妄想, submitted on 2020-01-01 04:55:23

Question


I have a class defined like this:

public class TimePeriodCalc {
    private double occupancy;
    private double efficiency;
    private String atDate;
}

I would like to perform the equivalent of the following SQL statement using the Java 8 Stream API.

SELECT atDate, AVG(occupancy), AVG(efficiency)
FROM TimePeriodCalc
GROUP BY atDate

I tried:

Collection<TimePeriodCalc> collector = result.stream().collect(groupingBy(p -> p.getAtDate(), ....

What can be put into the code to aggregate multiple attributes? I'm thinking of using multiple Collectors, but I really don't know how to do so.


Answer 1:


To do it without a custom Collector, and without streaming over the resulting map a second time, you could do it like this. It's a bit dirty, since each group is first collected into a List<TimePeriodCalc>, and that list is then streamed twice to compute the two averages.

Since you need two averages, they are collected into a holder or a pair; in this case I'm using AbstractMap.SimpleEntry.

// Needs java.util.AbstractMap.SimpleEntry, java.util.Map,
// java.util.stream.Collectors and java.util.stream.Stream.
Map<String, SimpleEntry<Double, Double>> map = Stream.of(
        new TimePeriodCalc(12d, 10d, "A"),
        new TimePeriodCalc(2d, 16d, "A"))
    .collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
        Collectors.collectingAndThen(Collectors.toList(), list -> {
            // Stream each group's list twice, once per average.
            double occupancy = list.stream().collect(
                    Collectors.averagingDouble(TimePeriodCalc::getOccupancy));
            double efficiency = list.stream().collect(
                    Collectors.averagingDouble(TimePeriodCalc::getEfficiency));
            // Key holds the occupancy average, value the efficiency average.
            return new SimpleEntry<>(occupancy, efficiency);
        })));

System.out.println(map);
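
For anyone who wants to run this end to end, a self-contained version might look like the sketch below. The constructor argument order (occupancy, efficiency, atDate) and the getters are assumptions, since the question only shows the fields; the `GroupedAveragesDemo` class and `averagesByDate` method are names made up for this illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Assumed shape of the question's class: constructor and getters are not shown there.
class TimePeriodCalc {
    private final double occupancy;
    private final double efficiency;
    private final String atDate;

    TimePeriodCalc(double occupancy, double efficiency, String atDate) {
        this.occupancy = occupancy;
        this.efficiency = efficiency;
        this.atDate = atDate;
    }

    double getOccupancy() { return occupancy; }
    double getEfficiency() { return efficiency; }
    String getAtDate() { return atDate; }
}

class GroupedAveragesDemo {
    // Groups by atDate, then averages each field over the group's list.
    static Map<String, SimpleEntry<Double, Double>> averagesByDate(Stream<TimePeriodCalc> stream) {
        return stream.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
                Collectors.collectingAndThen(Collectors.toList(), list -> {
                    double occupancy = list.stream()
                            .collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy));
                    double efficiency = list.stream()
                            .collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency));
                    return new SimpleEntry<>(occupancy, efficiency);
                })));
    }

    public static void main(String[] args) {
        Map<String, SimpleEntry<Double, Double>> map = averagesByDate(Stream.of(
                new TimePeriodCalc(12d, 10d, "A"),
                new TimePeriodCalc(2d, 16d, "A")));
        System.out.println(map); // prints {A=7.0=13.0}
    }
}
```

The entry prints as key=value, so the output reads as date=occupancyAverage=efficiencyAverage. A small dedicated result class would be easier to read than SimpleEntry if this ends up in real code.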



Answer 2:


Assuming that your TimePeriodCalc class has all the necessary getters and a constructor taking (occupancy, efficiency, atDate), this should get you the list you want:

List<TimePeriodCalc> result = new ArrayList<>(
    list.stream()
    .collect(Collectors.groupingBy(TimePeriodCalc::getAtDate, 
        Collectors.collectingAndThen(Collectors.toList(), TimePeriodCalc::avgTimePeriodCalc)))
    .values()
);

Where TimePeriodCalc.avgTimePeriodCalc is this method in the TimePeriodCalc class:

public static TimePeriodCalc avgTimePeriodCalc(List<TimePeriodCalc> list){
    return new TimePeriodCalc(
            list.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy)),
            list.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency)),
            list.get(0).getAtDate()
            );
}

The above can be combined into this monstrosity:

List<TimePeriodCalc> result = new ArrayList<>(
    list.stream()
    .collect(Collectors.groupingBy(TimePeriodCalc::getAtDate, 
        Collectors.collectingAndThen(
            Collectors.toList(), a -> {
                return new TimePeriodCalc(
                        a.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy)),
                        a.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency)),
                        a.get(0).getAtDate()
                        );
            }
        )))
    .values());

With input:

List<TimePeriodCalc> list = new ArrayList<>();
list.add(new TimePeriodCalc(10,10,"a"));
list.add(new TimePeriodCalc(10,10,"b"));
list.add(new TimePeriodCalc(10,10,"c"));
list.add(new TimePeriodCalc(5,5,"a"));
list.add(new TimePeriodCalc(0,0,"b"));

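This would give the following, assuming a matching toString() (the order of the groups is not guaranteed, since groupingBy makes no promise about the ordering of the returned map):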

TimePeriodCalc [occupancy=7.5, efficiency=7.5, atDate=a]
TimePeriodCalc [occupancy=5.0, efficiency=5.0, atDate=b]
TimePeriodCalc [occupancy=10.0, efficiency=10.0, atDate=c]



Answer 3:


Here's a way with a custom collector. It only needs one pass, but it's not very easy, especially because of generics...

If you have this method:

@SuppressWarnings("unchecked")
@SafeVarargs
static <T, A, C extends Collector<T, A, Double>> Collector<T, ?, List<Double>>
averagingManyDoubles(ToDoubleFunction<? super T>... extractors) {

    List<C> collectors = Arrays.stream(extractors)
        .map(extractor -> (C) Collectors.averagingDouble(extractor))
        .collect(Collectors.toList());

    class Acc {
        List<A> averages = collectors.stream()
            .map(c -> c.supplier().get())
            .collect(Collectors.toList());

        void add(T elem) {
            IntStream.range(0, extractors.length).forEach(i ->
                collectors.get(i).accumulator().accept(averages.get(i), elem));
        }

        Acc merge(Acc another) {
            IntStream.range(0, extractors.length).forEach(i ->
                averages.set(i, collectors.get(i).combiner()
                    .apply(averages.get(i), another.averages.get(i))));
            return this;
        }

        List<Double> finish() {
            return IntStream.range(0, extractors.length)
                .mapToObj(i -> collectors.get(i).finisher().apply(averages.get(i)))
                .collect(Collectors.toList());
        }
    }
    return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finish);
}

This receives an array of functions that extract double values from each element of the stream. Each extractor is converted to a Collectors.averagingDouble collector, and the local Acc class holds one mutable accumulation container per collector. The accumulator function of the resulting collector then forwards every element to each underlying accumulator, and the combiner and finisher functions delegate to the underlying combiners and finishers in the same way.
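
To make the supplier, accumulator, combiner and finisher roles easier to see, here is a reduced sketch of the same one-pass idea, fixed to two values instead of an arbitrary number of extractors. It is not the method above: the `averagingTwo` helper, the `Acc` holder and the `double[]` result are illustrative choices, it keeps plain running sums rather than delegating to Collectors.averagingDouble, and the constructor and getters of TimePeriodCalc are assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class TwoAveragesSketch {
    // Assumed shape of the question's class; constructor and getters are not shown there.
    static final class TimePeriodCalc {
        final double occupancy, efficiency;
        final String atDate;
        TimePeriodCalc(double occupancy, double efficiency, String atDate) {
            this.occupancy = occupancy; this.efficiency = efficiency; this.atDate = atDate;
        }
        double getOccupancy() { return occupancy; }
        double getEfficiency() { return efficiency; }
        String getAtDate() { return atDate; }
    }

    // Mutable accumulation state: two running sums and a shared element count.
    static final class Acc {
        double sumA, sumB;
        long count;
    }

    // Hypothetical helper: averages two extracted doubles in a single pass.
    static <T> Collector<T, Acc, double[]> averagingTwo(
            ToDoubleFunction<? super T> a, ToDoubleFunction<? super T> b) {
        return Collector.of(
                Acc::new,                          // supplier: fresh state per group
                (acc, t) -> {                      // accumulator: fold one element in
                    acc.sumA += a.applyAsDouble(t);
                    acc.sumB += b.applyAsDouble(t);
                    acc.count++;
                },
                (x, y) -> {                        // combiner: merge partial states (parallel streams)
                    x.sumA += y.sumA;
                    x.sumB += y.sumB;
                    x.count += y.count;
                    return x;
                },
                acc -> acc.count == 0              // finisher: turn sums into averages
                        ? new double[] {0d, 0d}
                        : new double[] {acc.sumA / acc.count, acc.sumB / acc.count});
    }

    static Map<String, double[]> averagesByDate(List<TimePeriodCalc> list) {
        return list.stream().collect(Collectors.groupingBy(
                TimePeriodCalc::getAtDate,
                averagingTwo(TimePeriodCalc::getOccupancy, TimePeriodCalc::getEfficiency)));
    }

    public static void main(String[] args) {
        Map<String, double[]> averages = averagesByDate(Arrays.asList(
                new TimePeriodCalc(10, 10, "a"),
                new TimePeriodCalc(5, 20, "a"),
                new TimePeriodCalc(0, 0, "b")));
        System.out.println(Arrays.toString(averages.get("a"))); // prints [7.5, 15.0]
    }
}
```

The generic version in this answer does the same thing, except that each slot's state and arithmetic come from an averagingDouble collector instead of hand-written sums.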

Usage is as follows:

Map<String, List<Double>> averages = list.stream()
    .collect(Collectors.groupingBy(
        TimePeriodCalc::getAtDate,
        averagingManyDoubles(
            TimePeriodCalc::getOccupancy,
            TimePeriodCalc::getEfficiency)));



Answer 4:


You can pass a downstream collector to groupingBy like this. It averages one attribute per group and yields a Map<String, Double> keyed by atDate:

Map<String, Double> averages = result.stream().collect(Collectors.groupingBy(p -> p.getAtDate(), Collectors.averagingDouble(p -> p.getOccupancy())));

If you want more, you get the idea.
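
Spelled out for two attributes, "more" most simply means one grouping pass per attribute, each producing its own Map<String, Double>. A minimal sketch, assuming a constructor and getters on TimePeriodCalc that the question does not show; `averageBy` is a helper name invented for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

class PerAttributeAverages {
    // Assumed shape of the question's class; constructor and getters are not shown there.
    static final class TimePeriodCalc {
        final double occupancy, efficiency;
        final String atDate;
        TimePeriodCalc(double occupancy, double efficiency, String atDate) {
            this.occupancy = occupancy; this.efficiency = efficiency; this.atDate = atDate;
        }
        double getOccupancy() { return occupancy; }
        double getEfficiency() { return efficiency; }
        String getAtDate() { return atDate; }
    }

    // One grouping pass per attribute: atDate -> average of the extracted value.
    static Map<String, Double> averageBy(List<TimePeriodCalc> rows,
                                         ToDoubleFunction<TimePeriodCalc> extractor) {
        return rows.stream().collect(Collectors.groupingBy(
                TimePeriodCalc::getAtDate, Collectors.averagingDouble(extractor)));
    }

    public static void main(String[] args) {
        List<TimePeriodCalc> rows = Arrays.asList(
                new TimePeriodCalc(10, 10, "a"),
                new TimePeriodCalc(5, 20, "a"),
                new TimePeriodCalc(0, 0, "b"));
        Map<String, Double> occupancy = averageBy(rows, TimePeriodCalc::getOccupancy);
        Map<String, Double> efficiency = averageBy(rows, TimePeriodCalc::getEfficiency);
        System.out.println(occupancy.get("a") + " " + efficiency.get("a")); // prints 7.5 15.0
    }
}
```

This walks the input once per attribute, which is fine for a list in memory but is not a single-pass solution.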



Source: https://stackoverflow.com/questions/44942493/multiple-aggregate-functions-in-java-8-stream-api
