Java Streams - Get a “symmetric difference list” from two other lists

徘徊边缘 submitted on 2019-12-07 03:25:19

Question


I'm trying to use Java 8 streams to combine lists. How can I get a "symmetric difference list" (all objects that exist in only one of the lists) from two existing lists? I already know how to get an intersection list and a union list.

In the code below I want the disjoint Cars from the two lists of cars (bigCarList, smallCarList). I expect the result to be a list containing the two cars "Toyota Corolla" and "Ford Focus".

Example code:

public void testDisjointLists() {
    List<Car> bigCarList = get5DefaultCars();
    List<Car> smallCarList = get3DefaultCars();

    //Get cars that exist in both lists
    List<Car> intersect = bigCarList.stream().filter(smallCarList::contains).collect(Collectors.toList());

    //Get all cars in both lists as one list
    List<Car> union = Stream.concat(bigCarList.stream(), smallCarList.stream()).distinct().collect(Collectors.toList());

    //Get all cars that exist in only one of the lists
    //List<Car> disjoint = ???

}

public List<Car> get5DefaultCars() {
    List<Car> cars = get3DefaultCars();
    cars.add(new Car("Toyota Corolla", 2008));
    cars.add(new Car("Ford Focus", 2010));
    return cars;
}

public List<Car> get3DefaultCars() {
    List<Car> cars = new ArrayList<>();
    cars.add(new Car("Volvo V70", 1990));
    cars.add(new Car("BMW I3", 1999));
    cars.add(new Car("Audi A3", 2005));
    return cars;
}

class Car {
    private int releaseYear;
    private String name;
    public Car(String name) {
        this.name = name;
    }
    public Car(String name, int releaseYear) {
        this.name = name;
        this.releaseYear = releaseYear;
    }

    //Overridden equals() and hashCode()
}
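The equals() and hashCode() bodies are elided above, yet every approach discussed here depends on them (List.contains, distinct() and hash-based maps all call them). A minimal sketch of what they might look like, assuming two cars are equal when both name and releaseYear match. That equality rule and the toString() are assumptions for illustration, not taken from the original post:

```java
import java.util.Objects;

// Hypothetical completion of the Car class from the question.
// Assumption: equality is defined by name AND releaseYear.
class Car {
    private final String name;
    private final int releaseYear;

    Car(String name, int releaseYear) {
        this.name = name;
        this.releaseYear = releaseYear;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Car)) return false;
        Car other = (Car) o;
        return releaseYear == other.releaseYear && Objects.equals(name, other.name);
    }

    // Must agree with equals(): equal cars produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(name, releaseYear);
    }

    @Override
    public String toString() {
        return name + " (" + releaseYear + ")";
    }
}
```

With this in place, two separately constructed `new Car("Volvo V70", 1990)` instances compare as equal, which is what lets the lists built by get5DefaultCars() and get3DefaultCars() be compared at all.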

Answer 1:


Based on your own code, there is a straightforward solution:

List<Car> disjoint = Stream.concat(
    bigCarList.stream().filter(c->!smallCarList.contains(c)),
    smallCarList.stream().filter(c->!bigCarList.contains(c))
).collect(Collectors.toList());

Just filter one list for all items not contained in the other, and vice versa, and concatenate both results. That works fairly well for small lists. Before considering optimized solutions like hashing, or making the result distinct(), you should ask yourself why you are using lists if you want neither duplicates nor a specific order.
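If the lists do get large, the hashing idea mentioned above could look roughly like the following. This is only a sketch: the class and method names are made up, Strings stand in for Car objects, and it assumes the element type has consistent equals() and hashCode(). The filter-and-concatenate logic is unchanged, only the membership test runs against HashSet copies instead of scanning the other list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
class HashedSymmetricDifference {

    // Keeps list semantics (encounter order, duplicates within one list),
    // but looks elements up in hash sets rather than via List.contains.
    static <T> List<T> symmetricDifference(List<T> first, List<T> second) {
        Set<T> inFirst = new HashSet<>(first);
        Set<T> inSecond = new HashSet<>(second);
        return Stream.concat(
                first.stream().filter(e -> !inSecond.contains(e)),
                second.stream().filter(e -> !inFirst.contains(e)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> big = Arrays.asList(
            "Volvo V70", "BMW I3", "Audi A3", "Toyota Corolla", "Ford Focus");
        List<String> small = Arrays.asList("Volvo V70", "BMW I3", "Audi A3");
        // prints [Toyota Corolla, Ford Focus]
        System.out.println(symmetricDifference(big, small));
    }
}
```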

It seems like you actually want Sets, not Lists. If you use Sets, Tagir Valeev’s solution is appropriate. But it is not working with the actual semantics of Lists, i.e. doesn’t work if the source lists contain duplicates.


But if you are using Sets, the code can be even simpler:

Set<Car> disjoint = Stream.concat(bigCarSet.stream(), smallCarSet.stream())
  .collect(Collectors.toMap(Function.identity(), t->true, (a,b)->null))
  .keySet();

This uses the toMap collector, which creates a Map (the value is irrelevant, we simply map to true here) and uses a merge function to handle duplicates. Since, for two sets, duplicates can only occur when an item is contained in both sets, these are exactly the items we want to remove.

The documentation of Collectors.toMap says that the merge function is treated “as supplied to Map.merge(Object, Object, BiFunction)” and we can learn from there, that simply mapping the duplicate pair to null will remove the entry.

So afterwards, the keySet() of the map contains the disjoint set.
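The removal behaviour can be checked directly on a plain HashMap. The sketch below uses made-up class and method names and Strings in place of Car objects; it first replays what the collector does internally with Map.merge, then wraps the toMap pipeline from above in a generic method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original answer.
class MergeToNullSketch {

    // Replays the accumulation step by hand for three incoming elements.
    static Map<String, Boolean> mergeDemo() {
        Map<String, Boolean> seen = new HashMap<>();
        seen.merge("Volvo V70", true, (x, y) -> null);  // key absent: entry is inserted
        seen.merge("Ford Focus", true, (x, y) -> null); // key absent: entry is inserted
        seen.merge("Volvo V70", true, (x, y) -> null);  // key present: function returns null, entry is removed
        return seen;
    }

    // The toMap pipeline from the answer, for any element type.
    static <T> Set<T> symmetricDifference(Set<T> a, Set<T> b) {
        return Stream.concat(a.stream(), b.stream())
            .collect(Collectors.toMap(Function.identity(), t -> true, (x, y) -> null))
            .keySet();
    }
}
```

After mergeDemo() only "Ford Focus" is left in the map. Note that a third occurrence of the same key would insert it again, which is why this trick relies on each input being a Set.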




Answer 2:


Something like this may work:

Stream.concat(bigCarList.stream(), smallCarList.stream())
      .collect(groupingBy(Function.identity(), counting()))
      .entrySet().stream()
      .filter(e -> e.getValue().equals(1L))
      .map(e -> e.getKey())
      .collect(toList());

Here we first collect all the cars into a Map<Car, Long> where the value is the number of times each car was encountered. After that we filter this map, keeping only the cars that were encountered exactly once, drop the counts and collect the keys into the final List.
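For completeness, here is the same idea as a self-contained sketch with the static imports (groupingBy, counting, toList) written out. The class and method names are invented for illustration and Strings stand in for Car objects. The example input deliberately contains a duplicate within one list, to show how this approach treats it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
class CountingSymmetricDifference {

    // Counts every element across both lists and keeps those seen exactly once.
    // The order of the result follows the map's iteration order, not list order.
    static <T> List<T> seenExactlyOnce(List<T> first, List<T> second) {
        Map<T, Long> counts = Stream.concat(first.stream(), second.stream())
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
            .filter(e -> e.getValue() == 1L)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "a" occurs twice in the first list only, "b" occurs once in each list.
        List<String> result = seenExactlyOnce(
            Arrays.asList("a", "a", "b"), Arrays.asList("b", "c"));
        System.out.println(result); // prints [c]
    }
}
```

Here "a" is dropped even though it never appears in the second list, because its total count is 2. With duplicate-free inputs the result is the symmetric difference.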




Answer 3:


A little bit of math

Disjoint: A and B are disjoint if their intersection is empty.

"Disjoint" is not a set; it is a property that tells you whether two sets have no elements in common. From your description I think you were looking for the symmetric difference.

Symmetric Difference
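For reference, the usual set-theoretic definition written out, with A and B standing for the two car lists viewed as sets:

```latex
A \mathbin{\triangle} B \;=\; (A \setminus B) \cup (B \setminus A) \;=\; (A \cup B) \setminus (A \cap B)
```

In the question's example, A is the list of five cars and B the list of three, so A ∩ B holds the three shared cars, B \ A is empty, and A △ B = {Toyota Corolla, Ford Focus}.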

But anyhow, if you only want to collect into new Lists, then all you need is a collector.

I wrote a method that creates a Collector. This Collector only "collects" values for which the predicate evaluates to true. So if you are looking for the symmetric difference, then all you need is a predicate.

  public void testDisjointLists() {
    List<Car> bigCarList = get5DefaultCars();
    List<Car> smallCarList = get3DefaultCars();

    Collector<Car, ArrayList<Car>, ArrayList<Car>> inter
        = produceCollector(car -> {
          return bigCarList.contains(car) && smallCarList.contains(car);
        });

    Collector<Car, ArrayList<Car>, ArrayList<Car>> symDiff
        = produceCollector(car -> {
          return bigCarList.contains(car) ^ smallCarList.contains(car);
        });

    //Get all cars in both list as one list
    List<Car> union
        = Stream.concat(bigCarList.stream(), smallCarList.stream()).distinct().collect(Collectors.toList());

    List<Car> intersect = union.stream().collect(inter);

    //Get all cars that exist in exactly one of the two lists
    List<Car> symmetricDifference = union.stream().collect(symDiff);

    System.out.println("Union Cars:");
    union.stream().forEach(car -> System.out.println("Car: " + car));
    System.out.println("");

    System.out.println("Intersect Cars: ");
    intersect.stream().forEach(car -> System.out.println("Car: " + car));
    System.out.println("");

    System.out.println("Symmetric Difference: ");
    symmetricDifference.stream().forEach(car -> System.out.println("Car: " + car));
    System.out.println("");
  }

  public Collector<Car, ArrayList<Car>, ArrayList<Car>> produceCollector(Predicate<Car> predicate) {
    Collector<Car, ArrayList<Car>, ArrayList<Car>> collector = Collector.of(
        ArrayList::new,
        (al, car) -> {
          if (predicate.test(car)) {
            al.add(car);
          }
        },
        (al1, al2) -> {
          al1.addAll(al2);
          return al1;
        }
    );
    return collector;
  }

For performance freaks

In a quick measurement (a single run of each variant, timed with System.nanoTime() as shown below), the collector came out about 14 times faster than the first filter solution.

long before2 = System.nanoTime();
List<Car> intersect2 = union.stream().filter(car -> {
  return bigCarList.contains(car) && smallCarList.contains(car);
}).collect(Collectors.toList());
long after2 = System.nanoTime();
System.out.println("Time for first filter solution: " + (after2 - before2));


long before = System.nanoTime();
List<Car> intersect = union.stream().collect(inter);
long after = System.nanoTime();
System.out.println("Time for collector solution: " + (after - before));

Time for first filter solution: 540906

Time for collector solution: 37543




Answer 4:


What I was looking for was the symmetric difference of the two lists (I have changed the question). The only reason I used Lists instead of Sets is that my method receives two lists; otherwise a Set would be more suitable.

The solution is what "holger" gave me above. Thanks.

List<Car> disjoint = Stream.concat(
    bigCarList.stream().filter(c->!smallCarList.contains(c)),
    smallCarList.stream().filter(c->!bigCarList.contains(c))
).collect(Collectors.toList());

This list contains the two cars, Toyota and Ford, that exist in only one of the lists (I tried it with two lists of unique cars and the result was correct).

Thanks for all the help.



Source: https://stackoverflow.com/questions/31074510/java-streams-get-a-symmetric-difference-list-from-two-other-lists
