Grouping by object value, counting and then setting group key by maximum object attribute

误落风尘 2020-11-28 14:53

I have managed to write a solution using the Java 8 Streams API that first groups a list of Route objects by value and then counts the number of objects in each group. It ret

4 answers
  • 2020-11-28 15:24

    I changed equals and hashCode to depend only on the start cell and the end cell.

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        Cell cell = (Cell) o;

        if (a != cell.a) return false;
        if (b != cell.b) return false;

        return true;
    }

    @Override
    public int hashCode() {
        int result = a;
        result = 31 * result + b;
        return result;
    }
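
    The snippet above shows the Cell methods only. As a minimal sketch of how Route could build on them, assuming hypothetical field names start, end and lastUpdated (the original Route class is not shown), equality can be delegated to the two cells while the timestamp is left out:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class RouteEqualityDemo {

    // Cell as in the snippet above: equality is based on the int fields a and b.
    static final class Cell {
        final int a;
        final int b;

        Cell(int a, int b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Cell cell = (Cell) o;
            return a == cell.a && b == cell.b;
        }

        @Override
        public int hashCode() {
            return 31 * a + b;
        }
    }

    // Hypothetical Route: the field names start, end and lastUpdated are assumptions.
    static final class Route {
        final Cell start;
        final Cell end;
        final long lastUpdated;

        Route(Cell start, Cell end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        // lastUpdated is deliberately ignored, so routes that differ only in
        // their timestamp are equal and land in the same group.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Route route = (Route) o;
            return start.equals(route.start) && end.equals(route.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    public static void main(String[] args) {
        Route older = new Route(new Cell(1, 2), new Cell(3, 4), 100L);
        Route newer = new Route(new Cell(1, 2), new Cell(3, 4), 200L);

        Map<Route, Long> counts = new HashMap<>();
        counts.merge(older, 1L, Long::sum);
        counts.merge(newer, 1L, Long::sum);

        System.out.println(older.equals(newer)); // true
        System.out.println(counts.size());       // 1
        System.out.println(counts.get(older));   // 2
    }
}
```

    Because both methods look at the same fields, equal routes always share a hash code, which is what hash-based grouping relies on.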
    

    My solution looks like this:

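    Map<Route, Long> routesCounted = routes.stream()
                // newest first, so the most recent route of each group is seen first
                .sorted((r1, r2) -> Long.compare(r2.lastUpdated, r1.lastUpdated))
                .collect(Collectors.groupingBy(gr -> gr, Collectors.counting()));
    

    The comparator uses Long.compare rather than casting the difference to int, because the difference of two long values can overflow an int and produce the wrong order.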
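
    To illustrate why a comparator of the form (int)(r2.lastUpdated - r1.lastUpdated) is fragile, here is a small sketch. The helper names byCast and bySafeCompare are made up for this example; they take the two timestamps directly instead of Route objects.

```java
class ComparatorOverflowDemo {

    // Newest-first comparison obtained by casting the long difference to int.
    static int byCast(long lastUpdated1, long lastUpdated2) {
        return (int) (lastUpdated2 - lastUpdated1);
    }

    // Newest-first comparison without any arithmetic on the timestamps.
    static int bySafeCompare(long lastUpdated1, long lastUpdated2) {
        return Long.compare(lastUpdated2, lastUpdated1);
    }

    public static void main(String[] args) {
        long older = 0L;

        // Difference is 2^32: the low 32 bits are zero, so the cast yields 0
        // and the two timestamps look equal.
        System.out.println(byCast(older, 1L << 32)); // 0

        // Difference is 2^31: the cast wraps to Integer.MIN_VALUE, so the
        // sign is the opposite of what a newest-first order needs.
        System.out.println(byCast(older, 1L << 31)); // -2147483648

        // Long.compare returns a positive value in both cases, which places
        // the older timestamp after the newer one.
        System.out.println(bySafeCompare(older, 1L << 32) > 0); // true
        System.out.println(bySafeCompare(older, 1L << 31) > 0); // true
    }
}
```

    With millisecond timestamps, a difference of 2^31 corresponds to roughly 25 days, so the problem is not limited to exotic inputs.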

  • 2020-11-28 15:31

    Here's one approach. First group into lists and then process the lists into the values you actually want:

    import static java.util.Comparator.comparingLong;
    import static java.util.stream.Collectors.groupingBy;
    import static java.util.stream.Collectors.toMap;
    
    
    Map<Route,Integer> routeCounts = routes.stream()
            .collect(groupingBy(x -> x))
            .values().stream()
            .collect(toMap(
                lst -> lst.stream().max(comparingLong(Route::getLastUpdated)).get(),
                List::size
            ));
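
    For trying this out, here is a self-contained sketch of the same pipeline. The Route class is a stand-in: its String fields start and end replace the real cells, and the sample data is invented.

```java
import static java.util.Comparator.comparingLong;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toMap;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class GroupThenMaxDemo {

    // Hypothetical Route: the String fields stand in for the start and end cells.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        // Equality ignores lastUpdated, so routes group by start and end only.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Route)) return false;
            Route other = (Route) o;
            return start.equals(other.start) && end.equals(other.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    static Map<Route, Integer> countByNewest(List<Route> routes) {
        return routes.stream()
                .collect(groupingBy(x -> x))   // Map<Route, List<Route>>
                .values().stream()             // one non-empty list per group
                .collect(toMap(
                        // key: the most recently updated route of the group
                        lst -> lst.stream().max(comparingLong(Route::getLastUpdated)).get(),
                        // value: the number of routes in the group
                        List::size));
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 10L),
                new Route("A", "B", 30L),
                new Route("A", "B", 20L),
                new Route("C", "D", 5L));

        countByNewest(routes).forEach((route, count) ->
                System.out.println(route.start + "->" + route.end
                        + " newest=" + route.lastUpdated + " count=" + count));
    }
}
```

    The call to get() on the Optional cannot fail here, because groupingBy never produces an empty list for a group.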
    
  • 2020-11-28 15:38

    You can define an abstract "library" method which combines two collectors into one:

    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> pairing(Collector<T, A1, R1> c1, 
            Collector<T, A2, R2> c2, BiFunction<R1, R2, R> finisher) {
        EnumSet<Characteristics> c = EnumSet.noneOf(Characteristics.class);
        c.addAll(c1.characteristics());
        c.retainAll(c2.characteristics());
        c.remove(Characteristics.IDENTITY_FINISH);
        return Collector.of(() -> new Object[] {c1.supplier().get(), c2.supplier().get()},
                (acc, v) -> {
                    c1.accumulator().accept((A1)acc[0], v);
                    c2.accumulator().accept((A2)acc[1], v);
                },
                (acc1, acc2) -> {
                    acc1[0] = c1.combiner().apply((A1)acc1[0], (A1)acc2[0]);
                    acc1[1] = c2.combiner().apply((A2)acc1[1], (A2)acc2[1]);
                    return acc1;
                },
                acc -> {
                    R1 r1 = c1.finisher().apply((A1)acc[0]);
                    R2 r2 = c2.finisher().apply((A2)acc[1]);
                    return finisher.apply(r1, r2);
                }, c.toArray(new Characteristics[c.size()]));
    }
    

    After that the actual operation may look like this:

    Map<Route, Long> result = routes.stream()
            .collect(Collectors.groupingBy(Function.identity(),
                pairing(Collectors.maxBy(Comparator.comparingLong(Route::getLastUpdated)), 
                        Collectors.counting(), 
                        (route, count) -> new AbstractMap.SimpleEntry<>(route.get(), count))
                ))
            .values().stream().collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
    

    Update: such a collector is available in my StreamEx library as MoreCollectors.pairing(). A similar collector is also implemented in the jOOL library, so you can use Tuple.collectors instead of pairing.
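
    On Java 12 and later, the JDK itself offers Collectors.teeing, which combines two collectors with a merge function in the same way as pairing. The following is a sketch under that assumption; the Route class with String fields and the sample data are invented for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class TeeingDemo {

    // Hypothetical Route: the String fields stand in for the start and end cells.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        // Equality ignores lastUpdated, so routes group by start and end only.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Route)) return false;
            Route other = (Route) o;
            return start.equals(other.start) && end.equals(other.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    static Map<Route, Long> countByNewest(List<Route> routes) {
        // Feeds every route of a group to both collectors, then pairs the
        // newest route with the group size. Requires Java 12 or later.
        Collector<Route, ?, Map.Entry<Route, Long>> newestWithCount = Collectors.teeing(
                Collectors.maxBy(Comparator.comparingLong(Route::getLastUpdated)),
                Collectors.counting(),
                (newest, count) -> new SimpleEntry<>(newest.get(), count));

        return routes.stream()
                .collect(Collectors.groupingBy(Function.identity(), newestWithCount))
                .values().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 10L),
                new Route("A", "B", 30L),
                new Route("C", "D", 5L));

        countByNewest(routes).forEach((route, count) ->
                System.out.println(route.start + "->" + route.end
                        + " newest=" + route.lastUpdated + " count=" + count));
    }
}
```

    Declaring the combined collector as a typed local variable is a choice made here to keep type inference simple; it can also be inlined into groupingBy.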

  • 2020-11-28 15:46

    In principle it seems like this ought to be doable in one pass. The usual wrinkle is that this requires an ad-hoc tuple or pair, in this case with a Route and a count. Since Java lacks these, we end up using an Object array of length 2 (as shown in Tagir Valeev's answer), or AbstractMap.SimpleImmutableEntry, or a hypothetical Pair<A,B> class.

    The alternative is to write a little value class that holds a Route and a count. Of course there's some pain in doing this, but in this case I think it pays off because it provides a place to put the combining logic. That in turn simplifies the stream operation.

    Here's the value class containing a Route and a count:

    class RouteCount {
        final Route route;
        final long count;
    
        private RouteCount(Route r, long c) {
            this.route = r;
            count = c;
        }
    
        public static RouteCount fromRoute(Route r) {
            return new RouteCount(r, 1L);
        }
    
        public static RouteCount combine(RouteCount rc1, RouteCount rc2) {
            Route recent;
            if (rc1.route.getLastUpdated() > rc2.route.getLastUpdated()) {
                recent = rc1.route;
            } else {
                recent = rc2.route;
            }
            return new RouteCount(recent, rc1.count + rc2.count);
        }
    }
    

    Pretty straightforward, but notice the combine method. It combines two RouteCount values by choosing the Route that's been updated more recently and using the sum of the counts. Now that we have this value class, we can write a one-pass stream to get the result we want:

    Map<Route, RouteCount> counted = routes.stream()
        .collect(groupingBy(route -> route,
                    collectingAndThen(
                        mapping(RouteCount::fromRoute, reducing(RouteCount::combine)),
                        Optional::get)));
    

    Like other answers, this groups the routes into equivalence classes based on the starting and ending cell. The actual Route instance used as the key isn't significant; it's just a representative of its class. The value will be a single RouteCount that contains the Route instance that has been updated most recently, along with the count of equivalent Route instances.

    The way this works is that each Route instance that has the same start and end cells is then fed into the downstream collector of groupingBy. This mapping collector maps the Route instance into a RouteCount instance, then passes it to a reducing collector that reduces the instances using the combining logic described above. The and-then portion of collectingAndThen extracts the value from the Optional<RouteCount> that the reducing collector produces.

    (Normally a bare get is dangerous, but we don't get to this collector at all unless there's at least one value available. So get is safe in this case.)
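
    As a runnable sketch of the collector described above, the example below uses a stand-in Route class (String fields instead of cells, an assumption) and invented data. It also includes a variant that is not part of the answer: the three-argument Collectors.toMap, which maps each route to a RouteCount and merges values for equal keys with the same combine method.

```java
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.reducing;
import static java.util.stream.Collectors.toMap;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

class RouteCountDemo {

    // Hypothetical Route: the String fields stand in for the start and end cells.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        // Equality ignores lastUpdated, so routes group by start and end only.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Route)) return false;
            Route other = (Route) o;
            return start.equals(other.start) && end.equals(other.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    // Value class from the answer, condensed.
    static final class RouteCount {
        final Route route;
        final long count;

        private RouteCount(Route r, long c) {
            this.route = r;
            this.count = c;
        }

        static RouteCount fromRoute(Route r) {
            return new RouteCount(r, 1L);
        }

        // Keeps the more recently updated route and adds the counts.
        static RouteCount combine(RouteCount rc1, RouteCount rc2) {
            Route recent = rc1.route.getLastUpdated() > rc2.route.getLastUpdated()
                    ? rc1.route : rc2.route;
            return new RouteCount(recent, rc1.count + rc2.count);
        }
    }

    // The collector described in the answer.
    static Map<Route, RouteCount> viaGroupingBy(List<Route> routes) {
        return routes.stream()
                .collect(groupingBy(route -> route,
                        collectingAndThen(
                                mapping(RouteCount::fromRoute, reducing(RouteCount::combine)),
                                Optional::get)));
    }

    // Variant (not from the answer): toMap merges the values of equal keys.
    static Map<Route, RouteCount> viaToMap(List<Route> routes) {
        return routes.stream()
                .collect(toMap(route -> route, RouteCount::fromRoute, RouteCount::combine));
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 10L),
                new Route("A", "B", 30L),
                new Route("A", "B", 20L),
                new Route("C", "D", 5L));

        viaGroupingBy(routes).values().forEach(rc ->
                System.out.println(rc.route.start + "->" + rc.route.end
                        + " newest=" + rc.route.lastUpdated + " count=" + rc.count));
        System.out.println(viaToMap(routes).size()); // 2
    }
}
```

    In both versions the map key is only a representative of its group; the most recent Route is the one stored inside the RouteCount value.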
