How to remove duplicate objects in a List without equals/hashcode?

渐次进展 2020-12-03 02:53

I have to remove duplicate objects from a List. It is a List of Blog objects, where Blog looks like this:

public class Blog {
    private String title;
    priv         


        
21 Answers
  • 2020-12-03 03:31

    If for some reason you don't want to override the equals method and you want to remove duplicates based on multiple properties, you can write a generic method to do that.

    We can write it in 2 versions:

    1. Modify the original list:

    @SafeVarargs
    public static <T> void removeDuplicatesFromList(List<T> list, Function<T, ?>... keyFunctions) {
    
        Set<List<?>> set = new HashSet<>();
    
        ListIterator<T> iter = list.listIterator();
        while(iter.hasNext()) {
            T element = iter.next();
    
            List<?> functionResults = Arrays.stream(keyFunctions)
                    .map(function -> function.apply(element))
                    .collect(Collectors.toList());
    
            if(!set.add(functionResults)) {
                iter.remove();
            }
        }
    }
    

    2. Return a new list:

    @SafeVarargs
    public static <T> List<T> getListWithoutDuplicates(List<T> list, Function<T, ?>... keyFunctions) {
    
        List<T> result = new ArrayList<>();
    
        Set<List<?>> set = new HashSet<>();
    
        for(T element : list) {
            List<?> functionResults = Arrays.stream(keyFunctions)
                    .map(function -> function.apply(element))
                    .collect(Collectors.toList());
    
            if(set.add(functionResults)) {
                result.add(element);
            }
        }
    
        return result;
    }
    

    In both cases we can consider any number of properties.

    For example, to remove duplicates based on 4 properties title, author, url and description:

    removeDuplicatesFromList(blogs, Blog::getTitle, Blog::getAuthor, Blog::getUrl, Blog::getDescription);
    

    The methods rely on the equals and hashCode methods of List, which are defined in terms of the list's elements: two lists are equal when they contain equal elements in the same order. In our case the elements of functionResults are the values returned by the passed getters, so that list can serve as an element of the Set to detect duplicates.
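To make the role of List equality concrete, here is a minimal sketch that is not part of the original answer. The class name ListKeyDemo and the helper isNewKey are made up for illustration. It only shows that a HashSet treats two separately built lists with equal contents as the same key.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the original answer.
class ListKeyDemo {

    // Returns true if no equal list of values has been added to the set before.
    static boolean isNewKey(Set<List<?>> seen, Object... values) {
        // List.equals and List.hashCode compare the contents element by
        // element, in order, so two distinct list instances with the same
        // values count as the same key.
        return seen.add(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Set<List<?>> seen = new HashSet<>();
        System.out.println(isNewKey(seen, "a", "a", "a", "a")); // true
        System.out.println(isNewKey(seen, "a", "a", "b", "b")); // true
        System.out.println(isNewKey(seen, "a", "a", "a", "a")); // false
    }
}
```

Element order is part of the key, so the getters must always be applied in the same order, which the generic methods above do by construction.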

    Complete example:

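    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.ListIterator;
    import java.util.Set;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    public class Duplicates {

        // Paste removeDuplicatesFromList and getListWithoutDuplicates (shown above) into this class.
    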
        public static void main(String[] args) {
    
            List<Blog> blogs = new ArrayList<>();
            blogs.add(new Blog("a", "a", "a", "a"));
            blogs.add(new Blog("b", "b", "b", "b"));
            blogs.add(new Blog("a", "a", "a", "a"));    // duplicate
            blogs.add(new Blog("a", "a", "b", "b"));
            blogs.add(new Blog("a", "b", "b", "b"));
            blogs.add(new Blog("a", "a", "b", "b"));    // duplicate
    
            List<Blog> blogsWithoutDuplicates = getListWithoutDuplicates(blogs, 
                    Blog::getTitle, Blog::getAuthor, Blog::getUrl, Blog::getDescription);
            System.out.println(blogsWithoutDuplicates); // [a a a a, b b b b, a a b b, a b b b]
            
            removeDuplicatesFromList(blogs, 
                    Blog::getTitle, Blog::getAuthor, Blog::getUrl, Blog::getDescription);
            System.out.println(blogs);                  // [a a a a, b b b b, a a b b, a b b b]
        }
    
        private static class Blog {
            private String title;
            private String author;
            private String url;
            private String description;
    
            public Blog(String title, String author, String url, String description) {
                this.title = title;
                this.author = author;
                this.url = url;
                this.description = description;
            }
    
            public String getTitle() {
                return title;
            }
    
            public String getAuthor() {
                return author;
            }
    
            public String getUrl() {
                return url;
            }
    
            public String getDescription() {
                return description;
            }
    
            @Override
            public String toString() {
                return String.join(" ", title, author, url, description);
            }
        }
    }
    
  • 2020-12-03 03:33

    If you can't edit the source of the class (why not?), then you need to iterate over the list and compare each item based on the four criteria mentioned ("title, author, url and description").

    To do this efficiently, I would create a new class, something like BlogKey, that contains those four fields and properly implements equals() and hashCode(). You can then iterate over the original list, constructing a BlogKey for each element and adding it to a HashMap:

    Map<BlogKey, Blog> map = new HashMap<BlogKey, Blog>();
    for (Blog blog : blogs) {
         BlogKey key = createKey(blog);
         if (!map.containsKey(key)) {
              map.put(key, blog);
         }
    }
    Collection<Blog> uniqueBlogs = map.values();
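
The answer does not show BlogKey or createKey, so the following is only one possible sketch of them. The Blog class here is a simplified stand-in for the one in the question, and uniqueBlogs is a made-up helper name. It uses a LinkedHashMap instead of a HashMap, on the assumption that the original list order should be kept.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo class, not part of the original answer.
class BlogKeyDemo {

    // Stand-in for the question's Blog class; it deliberately has no equals/hashCode.
    static class Blog {
        final String title, author, url, description;

        Blog(String title, String author, String url, String description) {
            this.title = title;
            this.author = author;
            this.url = url;
            this.description = description;
        }
    }

    // Assumed shape of the key class: value-based equality over the four fields.
    static final class BlogKey {
        private final String title, author, url, description;

        BlogKey(Blog blog) {
            this.title = blog.title;
            this.author = blog.author;
            this.url = blog.url;
            this.description = blog.description;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof BlogKey)) return false;
            BlogKey other = (BlogKey) o;
            return Objects.equals(title, other.title)
                    && Objects.equals(author, other.author)
                    && Objects.equals(url, other.url)
                    && Objects.equals(description, other.description);
        }

        @Override
        public int hashCode() {
            return Objects.hash(title, author, url, description);
        }
    }

    static BlogKey createKey(Blog blog) {
        return new BlogKey(blog);
    }

    // Keeps the first Blog seen for each key, in the original list order.
    static List<Blog> uniqueBlogs(List<Blog> blogs) {
        Map<BlogKey, Blog> map = new LinkedHashMap<>();
        for (Blog blog : blogs) {
            map.putIfAbsent(createKey(blog), blog);
        }
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        List<Blog> blogs = new ArrayList<>();
        blogs.add(new Blog("a", "a", "a", "a"));
        blogs.add(new Blog("b", "b", "b", "b"));
        blogs.add(new Blog("a", "a", "a", "a")); // duplicate
        System.out.println(uniqueBlogs(blogs).size()); // 2
    }
}
```

With a plain HashMap the result would contain the same elements, but the iteration order of map.values() would not be guaranteed to match the original list.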
    

    However, by far the simplest thing is to edit the original source code of Blog so that it correctly implements equals() and hashCode().

  • 2020-12-03 03:33

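    You can use distinct() to remove duplicates. Note that distinct() compares elements with equals(), so this only works if Blog overrides equals() and hashCode(); otherwise only references to the same instance are treated as duplicates.

    List<Blog> blogList = new ArrayList<>(); // add your Blog objects here
    
    List<Blog> uniqueBlogs = blogList.stream().distinct().collect(Collectors.toList());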
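
When Blog does not override equals() and hashCode(), a stream pipeline can still deduplicate by collecting into a map keyed on the chosen properties. The sketch below is an assumption-laden illustration rather than part of the answer: distinctBy, keyOf and StreamDedupDemo are made-up names, and the Blog class is a reduced stand-in with only two properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original answer.
class StreamDedupDemo {

    // Builds the composite key: the results of all key functions, in order.
    static <T> List<Object> keyOf(T item, Function<? super T, ?>[] keyFunctions) {
        List<Object> key = new ArrayList<>();
        for (Function<? super T, ?> f : keyFunctions) {
            key.add(f.apply(item));
        }
        return key;
    }

    // Keeps the first element for each distinct key, preserving encounter order.
    @SafeVarargs
    static <T> List<T> distinctBy(List<T> items, Function<? super T, ?>... keyFunctions) {
        Map<List<Object>, T> firstByKey = items.stream().collect(Collectors.toMap(
                item -> keyOf(item, keyFunctions),
                Function.identity(),
                (first, second) -> first,   // on a key collision keep the earlier element
                LinkedHashMap::new));       // insertion-ordered map
        return new ArrayList<>(firstByKey.values());
    }

    // Reduced stand-in for the question's Blog class (no equals/hashCode).
    static class Blog {
        private final String title;
        private final String author;

        Blog(String title, String author) {
            this.title = title;
            this.author = author;
        }

        String getTitle() { return title; }
        String getAuthor() { return author; }
    }

    public static void main(String[] args) {
        List<Blog> blogs = Arrays.asList(
                new Blog("a", "a"), new Blog("b", "b"), new Blog("a", "a"));
        System.out.println(distinctBy(blogs, Blog::getTitle, Blog::getAuthor).size()); // 2
    }
}
```

One caveat of this sketch: the toMap overload with a merge function rejects null values, so the list itself must not contain null elements, although null property values inside the key are fine.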
    
    