Extract duplicate objects from a List in Java 8

天大地大妈咪最大 提交于 2019-12-03 07:31:46

To indentify duplicates, no method I know of is better suited than Collectors.groupingBy(). This allows you to group the list into a map based on a condition of your choice.

Your condition is a combination of id and firstName. Let's extract this part into an own method in Person:

String uniqueAttributes() {
  return id + firstName;
}

The getDuplicates() method is now quite straightforward:

public static List<Person> getDuplicates(final List<Person> personList) {
  return getDuplicatesMap(personList).values().stream()
      .filter(duplicates -> duplicates.size() > 1)
      .flatMap(Collection::stream)
      .collect(Collectors.toList());
}

private static Map<String, List<Person>> getDuplicatesMap(List<Person> personList) {
  return personList.stream().collect(groupingBy(Person::uniqueAttributes));
}
  • The first line calls another method getDuplicatesMap() to create the map as explained above.
  • It then streams over the values of the map, which are lists of persons.
  • It filters out everything except lists with a size greater than 1, i.e. it finds the duplicates.
  • Finally, flatMap() is used to flatten the stream of lists into one single stream of persons, and collects the stream to a list.

An alternative, if you truly identify persons as equal if the have the same id and firstName is to go with the solution by Jonathan Johx and implement an equals() method.

Deadpool

In this scenario you need to write your custom logic to extract the duplicates from the list, you will get all the duplicates in the Person list

   public static List<Person> extractDuplicates(final List<Person> personList) {

    return personList.stream().flatMap(i -> {
        final AtomicInteger count = new AtomicInteger();
        final List<Person> duplicatedPersons = new ArrayList<>();

        personList.forEach(p -> {

            if (p.getId().equals(i.getId()) && p.getFirstName().equals(i.getFirstName())) {
                count.getAndIncrement();
            }

            if (count.get() == 2) {
                duplicatedPersons.add(i);
            }

        });

        return duplicatedPersons.stream();
    }).collect(Collectors.toList());
}

Applied to:

 List<Person> l = new ArrayList<>();
           Person alex = new 
 Person.Builder().id(1L).firstName("alex").secondName("salgado").build();
            Person lolita = new 
 Person.Builder().id(2L).firstName("lolita").secondName("llanero").build();
            Person elpidio = new 
 Person.Builder().id(3L).firstName("elpidio").secondName("ramirez").build();
            Person romualdo = new 
 Person.Builder().id(4L).firstName("romualdo").secondName("gomez").build();
            Person otroRomualdo = new 
 Person.Builder().id(4L).firstName("romualdo").secondName("perez").build();
      l.add(alex);
      l.add(lolita);
      l.add(elpidio);
      l.add(romualdo);
      l.add(otroRomualdo);

Output:

[Person [id=4, firstName=romualdo, secondName=gomez], Person [id=4, firstName=romualdo, secondName=perez]]
List<Person> duplicates = personList.stream()
  .collect(Collectors.groupingBy(Person::getId))
  .entrySet().stream()
  .filter(e->e.getValue().size() > 1)
  .flatMap(e->e.getValue().stream())
  .collect(Collectors.toList());

That should give you a List of Person where the id has been duplicated.

List<Person> arr = new ArrayList<>();
arr.add(alex);
arr.add(lolita);
arr.add(elpidio);
arr.add(romualdo);
arr.add(otroRomualdo);

Set<String> set = new HashSet<>();
List<Person> result = arr.stream()
                         .filter(data -> (set.add(data.name +";"+ Long.toString(data.id)) == false))
                         .collect(Collectors.toList());
arr.removeAll(result);
Set<String> set2 = new HashSet<>();
result.stream().forEach(data -> set2.add(data.name +";"+ Long.toString(data.id)));
List<Person> resultTwo = arr.stream()
                            .filter(data -> (set2.add(data.name +";"+ Long.toString(data.id)) == false))
                            .collect(Collectors.toList());
result.addAll(resultTwo);

The above code will filter based on name and id. The result List will have all the duplicated Person Object

I think first you should overwrite equals method of Person class and focus on id and name. And after you can update it adding a filter for that.

@Override
public int hashCode() {
    return Objects.hash(id, name);
}

@Override
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    if (obj == null) {
        return false;
    }
    if (getClass() != obj.getClass()) {
        return false;
    }
    final Person other = (Person) obj;
    if (!Objects.equals(name, other.name)) {
        return false;
    }
    if (!Objects.equals(id, other.id)) {
        return false;
    }
    return true;
}

 personList
       .stream() 
       .filter(p -> personList.contains(p))
       .collect(Collectors.toList());
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!