I have to remove duplicate objects from a List. It is a List of Blog objects, where Blog looks like this:
public class Blog {
private String title;
priv
It is recommended to override equals() and hashCode() so that the class works with hash-based collections, including HashMap, HashSet, and Hashtable. Once you have done this, you can easily remove duplicates by initializing a HashSet with the Blog list.
List<Blog> blogList = getBlogList();
Set<Blog> noDuplication = new HashSet<Blog>(blogList);
However, since you mentioned that you cannot change the code to add equals() and hashCode(), Java 8 offers a cleaner way to do this:
Collection<Blog> uniqueBlogs = getUniqueBlogList(blogList);
private Collection<Blog> getUniqueBlogList(List<Blog> blogList) {
return blogList.stream()
.collect(Collectors.toMap(createUniqueKey(), Function.identity(), (blog1, blog2) -> blog1))
.values();
}
List<Blog> updatedBlogList = new ArrayList<>(uniqueBlogs);
The third parameter of Collectors.toMap() is a merge function (a BinaryOperator, which is a functional interface) used to resolve collisions between values associated with the same key. Here it keeps the first blog and discards the later one.
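To make the toMap() approach concrete, here is a minimal self-contained sketch. The Blog class is a cut-down stand-in with only two fields, and treating title plus URL as the key is an assumption about what counts as a duplicate; DedupByKey and uniqueByKey are hypothetical names. Passing LinkedHashMap::new as a fourth argument keeps the first-seen order, which the three-argument form does not guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class DedupByKey {

    // Cut-down stand-in for the real Blog class (no equals()/hashCode() overrides).
    static class Blog {
        final String title;
        final String url;

        Blog(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    // Keeps the first Blog seen for each key; later ones with the same key are dropped.
    static List<Blog> uniqueByKey(List<Blog> blogs) {
        return new ArrayList<>(blogs.stream()
                .collect(Collectors.toMap(
                        b -> Arrays.asList(b.title, b.url), // assumed notion of "same blog"
                        Function.identity(),
                        (first, second) -> first,           // merge function: keep the first
                        LinkedHashMap::new))                // preserve encounter order
                .values());
    }

    public static void main(String[] args) {
        List<Blog> blogs = Arrays.asList(
                new Blog("A", "a"), new Blog("B", "b"), new Blog("A", "a"));
        for (Blog b : uniqueByKey(blogs)) {
            System.out.println(b.title + " " + b.url);
        }
    }
}
```

Using a List of the relevant fields as the key avoids having to invent a string separator; any object with value-based equals() and hashCode() would work just as well.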
Create a new class that wraps your Blog object and provides the equals()/hashCode() methods you need. For maximum efficiency I would add two static methods on the wrapper, one to convert a list of Blogs into a list of BlogWrappers and the other to convert a list of BlogWrappers back into a list of Blogs. Then you would wrap the list, put the wrappers into a HashSet to drop the duplicates, and unwrap the result.
Code for Blog Wrapper would be something like this:
import java.util.ArrayList;
import java.util.List;
public class BlogWrapper {
public static List<Blog> unwrappedList(List<BlogWrapper> blogWrapperList) {
if (blogWrapperList == null)
return new ArrayList<Blog>(0);
List<Blog> blogList = new ArrayList<Blog>(blogWrapperList.size());
for (BlogWrapper bW : blogWrapperList) {
blogList.add(bW.getBlog());
}
return blogList;
}
public static List<BlogWrapper> wrappedList(List<Blog> blogList) {
if (blogList == null)
return new ArrayList<BlogWrapper>(0);
List<BlogWrapper> blogWrapperList = new ArrayList<BlogWrapper>(blogList
.size());
for (Blog b : blogList) {
blogWrapperList.add(new BlogWrapper(b));
}
return blogWrapperList;
}
private Blog blog = null;
public BlogWrapper() {
super();
}
public BlogWrapper(Blog aBlog) {
super();
setBlog(aBlog);
}
public boolean equals(Object other) {
// Your equality logic here
return super.equals(other);
}
public Blog getBlog() {
return blog;
}
public int hashCode() {
// Your hashcode logic here
return super.hashCode();
}
public void setBlog(Blog blog) {
this.blog = blog;
}
}
And you could use this like so:
List<BlogWrapper> myBlogWrappers = BlogWrapper.wrappedList(blogList); // blogList is your original List<Blog>
Set<BlogWrapper> noDupWrapSet = new HashSet<BlogWrapper>(myBlogWrappers);
List<BlogWrapper> noDupWrapList = new ArrayList<BlogWrapper>(noDupWrapSet);
List<Blog> noDupList = BlogWrapper.unwrappedList(noDupWrapList);
Quite obviously you can make the above code more efficient, particularly by making the wrap and unwrap methods on Blog Wrapper take collections instead of Lists.
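As an illustration of what the placeholder equals() and hashCode() might contain, here is a minimal sketch. It assumes two blogs are duplicates when their titles and URLs match, and it uses a cut-down Blog with just those two fields; both are assumptions, so substitute whatever fields define identity in your case. The names WrapperDemo, TitleUrlWrapper and dedupe are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class WrapperDemo {

    // Cut-down stand-in for a Blog class that cannot be modified.
    static class Blog {
        final String title;
        final String url;

        Blog(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    // Wrapper that supplies the equality the Blog class lacks.
    static final class TitleUrlWrapper {
        final Blog blog;

        TitleUrlWrapper(Blog blog) {
            this.blog = blog;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof TitleUrlWrapper)) return false;
            Blog that = ((TitleUrlWrapper) other).blog;
            return Objects.equals(blog.title, that.title)
                    && Objects.equals(blog.url, that.url);
        }

        @Override
        public int hashCode() {
            return Objects.hash(blog.title, blog.url);
        }
    }

    // Accepts any Collection; returns the distinct blogs in first-seen order.
    static List<Blog> dedupe(Collection<Blog> blogs) {
        Set<TitleUrlWrapper> seen = new LinkedHashSet<>();
        for (Blog b : blogs) {
            seen.add(new TitleUrlWrapper(b)); // add() ignores an element that is already present
        }
        List<Blog> result = new ArrayList<>(seen.size());
        for (TitleUrlWrapper w : seen) {
            result.add(w.blog);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Blog> blogs = Arrays.asList(
                new Blog("A", "a"), new Blog("A", "a"), new Blog("B", "b"));
        System.out.println(dedupe(blogs).size()); // prints 2
    }
}
```

Wrapping and unwrapping inside a single helper avoids building the intermediate wrapper lists.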
An alternative to wrapping the Blog class would be to use a bytecode manipulation library like BCEL to actually change the equals() and hashCode() methods of Blog. But of course, that could have unintended consequences for the rest of your code if it relies on the original equals()/hashCode() behaviour.
If your Blog class has an appropriate equals() method (and a matching hashCode()) defined on it, the simplest way is just to create a Set out of your list, which will automatically remove duplicates:
List<Blog> blogList = ...; // your initial list
Set<Blog> noDups = new HashSet<Blog>(blogList);
The chances are this will work transparently with the rest of your code - if you're just iterating over the contents, for example, then any instance of Collection is as good as another. (If iteration order matters, then you may prefer a LinkedHashSet instead, which will preserve the original ordering of the list.)
If you really need the result to be a List, then keeping with the straightforward approach, you can just convert it straight back again by wrapping it in an ArrayList (or similar). If your collections are relatively small (less than a thousand elements, say) then the apparent inefficiencies of this approach are likely to be immaterial.
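A short sketch of the order-preserving variant, converting straight back to a List. It uses String elements as a stand-in because String already has suitable equals() and hashCode(); the names OrderedDedup and withoutDuplicates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {

    // Removes duplicates while keeping the first occurrence of each element in place.
    static <T> List<T> withoutDuplicates(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<T>(input));
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(withoutDuplicates(titles)); // prints [b, a, c]
    }
}
```

The same helper works for Blog once the class defines equals() and hashCode() consistently.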
import java.util.ArrayList;
import java.util.HashSet;
class Person
{
public int age;
public String name;
public int hashCode()
{
// System.out.println("In hashcode");
int hashcode = 0;
hashcode = age*20;
hashcode += name.hashCode();
System.out.println("In hashcode : "+hashcode);
return hashcode;
}
public boolean equals(Object obj)
{
if (obj instanceof Person)
{
Person pp = (Person) obj;
boolean flag=(pp.name.equals(this.name) && pp.age == this.age);
System.out.println(pp);
System.out.println(pp.name+" "+this.name);
System.out.println(pp.age+" "+this.age);
System.out.println("In equals : "+flag);
return flag;
}
else
{
System.out.println("In equals : false");
return false;
}
}
public void setAge(int age)
{
this.age=age;
}
public int getAge()
{
return age;
}
public void setName(String name )
{
this.name=name;
}
public String getName()
{
return name;
}
public String toString()
{
return "[ "+name+", "+age+" ]";
}
}
class ListRemoveDuplicateObject
{
public static void main(String[] args)
{
ArrayList<Person> al=new ArrayList<>();
Person person =new Person();
person.setName("Neelesh");
person.setAge(26);
al.add(person);
person =new Person();
person.setName("Hitesh");
person.setAge(16);
al.add(person);
person =new Person();
person.setName("jyoti");
person.setAge(27);
al.add(person);
person =new Person();
person.setName("Neelesh");
person.setAge(60);
al.add(person);
person =new Person();
person.setName("Hitesh");
person.setAge(16);
al.add(person);
person =new Person();
person.setName("Mohan");
person.setAge(56);
al.add(person);
person =new Person();
person.setName("Hitesh");
person.setAge(16);
al.add(person);
System.out.println(al);
HashSet<Person> al1=new HashSet<>();
al1.addAll(al);
al.clear();
al.addAll(al1);
System.out.println(al);
}
}
Output:
[[ Neelesh, 26 ], [ Hitesh, 16 ], [ jyoti, 27 ], [ Neelesh, 60 ], [ Hitesh, 16 ], [ Mohan, 56 ], [ Hitesh, 16 ]]
In hashcode : -801018364
In hashcode : -2133141913
In hashcode : 101608849
In hashcode : -801017684
In hashcode : -2133141913
[ Hitesh, 16 ]
Hitesh Hitesh
16 16
In equals : true
In hashcode : 74522099
In hashcode : -2133141913
[ Hitesh, 16 ]
Hitesh Hitesh
16 16
In equals : true
[[ Neelesh, 60 ], [ Neelesh, 26 ], [ Mohan, 56 ], [ jyoti, 27 ], [ Hitesh, 16 ]]
Here is the complete code which works for this scenario:
class Blog {
private String title;
private String author;
private String url;
public String getTitle() {
return title;
}
public void setTitle(String title) {
this.title = title;
}
public String getAuthor() {
return author;
}
public void setAuthor(String author) {
this.author = author;
}
public String getUrl() {
return url;
}
public void setUrl(String url) {
this.url = url;
}
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
private String description;
Blog(String title, String author, String url, String description)
{
this.title = title;
this.author = author;
this.url = url;
this.description = description;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if(obj instanceof Blog)
{
Blog temp = (Blog) obj;
// Compare string contents with equals(); == only compares references
if(this.title.equals(temp.title) && this.author.equals(temp.author) && this.url.equals(temp.url) && this.description.equals(temp.description))
return true;
}
return false;
}
@Override
public int hashCode() {
// Consistent with equals(): equal field values give equal hash codes
return (this.title.hashCode() + this.author.hashCode() + this.url.hashCode() + this.description.hashCode());
}
}
Here is the main function which will eliminate the duplicates:
public static void main(String[] args) {
Blog b1 = new Blog("A", "sam", "a", "desc");
Blog b2 = new Blog("B", "ram", "b", "desc");
Blog b3 = new Blog("C", "cam", "c", "desc");
Blog b4 = new Blog("A", "sam", "a", "desc");
Blog b5 = new Blog("D", "dam", "d", "desc");
List<Blog> list = new ArrayList<>();
list.add(b1);
list.add(b2);
list.add(b3);
list.add(b4);
list.add(b5);
//Removing Duplicates;
Set<Blog> s= new HashSet<Blog>();
s.addAll(list);
list = new ArrayList<Blog>();
list.addAll(s);
//Now the List contains only the unique elements
}
Here is one way of removing duplicate objects.
The Blog class should look something like this, a proper POJO that overrides both equals() and hashCode():
public class Blog {
private String title;
private String author;
private String url;
private String description;
public String getTitle() {
return title;
}
public void setTitle(String title) {
this.title = title;
}
public String getAuthor() {
return author;
}
public void setAuthor(String author) {
this.author = author;
}
public String getUrl() {
return url;
}
public void setUrl(String url) {
this.url = url;
}
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (!(obj instanceof Blog)) {
return false;
}
Blog blog = (Blog)obj;
return title.equals(blog.title) &&
author.equals(blog.author) &&
url.equals(blog.url) &&
description.equals(blog.description);
}
@Override
public int hashCode() {
// Must be overridden together with equals(), otherwise hash-based sets cannot detect duplicates
int result = title.hashCode();
result = 31 * result + author.hashCode();
result = 31 * result + url.hashCode();
result = 31 * result + description.hashCode();
return result;
}
}
Use it like this to remove duplicate objects. The key data structure here is the LinkedHashSet, a Set implementation that removes duplicates and also keeps the order of insertion.
Blog blog1 = new Blog();
blog1.setTitle("Game of Thrones");
blog1.setAuthor("HBO");
blog1.setDescription("The best TV show in the US");
blog1.setUrl("www.hbonow.com/gameofthrones");
Blog blog2 = new Blog();
blog2.setTitle("Game of Thrones");
blog2.setAuthor("HBO");
blog2.setDescription("The best TV show in the US");
blog2.setUrl("www.hbonow.com/gameofthrones");
Blog blog3 = new Blog();
blog3.setTitle("Ray Donovan");
blog3.setAuthor("Showtime");
blog3.setDescription("The second best TV show in the US");
blog3.setUrl("www.showtime.com/raydonovan");
ArrayList<Blog> listOfBlogs = new ArrayList<>();
listOfBlogs.add(blog1);
listOfBlogs.add(blog2);
listOfBlogs.add(blog3);
Set<Blog> setOfBlogs = new LinkedHashSet<>(listOfBlogs);
listOfBlogs.clear();
listOfBlogs.addAll(setOfBlogs);
for(int i=0;i<listOfBlogs.size();i++)
System.out.println(listOfBlogs.get(i).getTitle());
Running this should print:
Game of Thrones
Ray Donovan
The second one will be removed because it is a duplicate of the first object.