Question
This is a pretty common question, but I could not find this part:
Say I have this array list:
    List<MyDataClass> arrayList = new ArrayList<MyDataClass>();

    class MyDataClass {
        String name;
        String age;
    }
Now, I need to find duplicates on the basis of age in MyDataClass and remove them. How can this be done using something like HashSet, as described here?
I guess we will need to override equals in MyDataClass?
- But what if I do not have the luxury of doing that?
- And how does HashSet actually detect duplicates internally and avoid adding them? I saw its implementation here in OpenJDK but couldn't understand it.
Answer 1:
I'd suggest that you override both equals and hashCode (HashSet relies on both!).
To remove the duplicates, you could simply create a new HashSet with the ArrayList as its constructor argument, then clear the ArrayList and put back the elements stored in the HashSet. Note that a HashSet does not preserve insertion order; use a LinkedHashSet if the original order matters.
    class MyDataClass {
        String name;
        String age;

        @Override
        public int hashCode() {
            return name.hashCode() ^ age.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof MyDataClass))
                return false;
            MyDataClass mdc = (MyDataClass) obj;
            return mdc.name.equals(name) && mdc.age.equals(age);
        }
    }
And then do:

    List<MyDataClass> arrayList = new ArrayList<MyDataClass>();
    Set<MyDataClass> uniqueElements = new HashSet<MyDataClass>(arrayList);
    arrayList.clear();
    arrayList.addAll(uniqueElements);
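
The question asks for duplicates "on the basis of age" only, while the equals above compares both name and age. As a rough sketch of the age-only case that does not touch MyDataClass at all, one could track the ages already seen in a HashSet<String> and keep the first element for each age. The names DedupeByAge and dedupeByAge, the constructor on MyDataClass, and the sample data are made up for this illustration and are not part of the original question.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupeByAge {
    // Minimal stand-in for the question's class (assumption: a simple
    // constructor is added here). equals/hashCode are NOT overridden.
    static class MyDataClass {
        String name;
        String age;

        MyDataClass(String name, String age) {
            this.name = name;
            this.age = age;
        }
    }

    // Keeps the first element seen for each age and preserves list order.
    static List<MyDataClass> dedupeByAge(List<MyDataClass> input) {
        Set<String> seenAges = new HashSet<>();
        List<MyDataClass> result = new ArrayList<>();
        for (MyDataClass item : input) {
            // Set.add returns false when the age is already in the set.
            if (seenAges.add(item.age)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<MyDataClass> people = new ArrayList<>();
        people.add(new MyDataClass("Ann", "30"));
        people.add(new MyDataClass("Bob", "25"));
        people.add(new MyDataClass("Cid", "30"));
        for (MyDataClass p : dedupeByAge(people)) {
            System.out.println(p.name + " " + p.age);
        }
    }
}
```

This works because String already implements equals and hashCode, so only the key (age) goes into the set, not the MyDataClass objects themselves.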
"But, what if I do not have the luxury of doing that?"
Then I'd suggest writing some sort of decorator class that wraps MyDataClass and provides these methods:
    class MyDataClassDecorator {
        MyDataClass mdc;

        public MyDataClassDecorator(MyDataClass mdc) {
            this.mdc = mdc;
        }

        @Override
        public int hashCode() {
            return mdc.name.hashCode() ^ mdc.age.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof MyDataClassDecorator))
                return false;
            MyDataClassDecorator mdcd = (MyDataClassDecorator) obj;
            return mdcd.mdc.name.equals(mdc.name) && mdcd.mdc.age.equals(mdc.age);
        }
    }
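
The answer stops at the wrapper itself, so here is one possible way to use such a wrapper: wrap each element, collect the wrappers in a set, then unwrap. This is only a sketch under assumptions. The class names WrapperDedupe and Key, the dedupe method, and the constructor on MyDataClass are invented for the example; Key plays the role of the decorator and uses java.util.Objects so that null fields do not throw.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class WrapperDedupe {
    // Stand-in for a class we cannot modify (no equals/hashCode overrides).
    static class MyDataClass {
        String name;
        String age;

        MyDataClass(String name, String age) {
            this.name = name;
            this.age = age;
        }
    }

    // Hypothetical wrapper that supplies value-based equals/hashCode.
    static final class Key {
        final MyDataClass mdc;

        Key(MyDataClass mdc) {
            this.mdc = mdc;
        }

        @Override
        public int hashCode() {
            return Objects.hash(mdc.name, mdc.age);
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Key))
                return false;
            Key other = (Key) obj;
            return Objects.equals(mdc.name, other.mdc.name)
                    && Objects.equals(mdc.age, other.mdc.age);
        }
    }

    // Wraps every element, lets the set drop equal wrappers, then unwraps.
    static List<MyDataClass> dedupe(List<MyDataClass> input) {
        // LinkedHashSet keeps first-seen order; a plain HashSet would not.
        Set<Key> unique = new LinkedHashSet<>();
        for (MyDataClass item : input) {
            unique.add(new Key(item));
        }
        List<MyDataClass> result = new ArrayList<>();
        for (Key key : unique) {
            result.add(key.mdc);
        }
        return result;
    }
}
```

To compare on age alone, as in the question, the wrapper's equals and hashCode would use only mdc.age.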
Answer 2:
Please see this article, which explains the importance of equals() and hashCode() to HashSet.
Also, see this previously answered question.
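
To make the equals()/hashCode() point concrete, here is a small sketch (not taken from the linked article) comparing two made-up classes. Plain inherits Object's identity-based equals and hashCode. ByAge overrides both on a single field. HashSet uses hashCode to pick a bucket and equals to confirm a match, so only ByAge is deduplicated. All class and method names here are hypothetical.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashSetContract {
    // No overrides: two instances are equal only if they are the same object.
    static class Plain {
        final String age;

        Plain(String age) {
            this.age = age;
        }
    }

    // Overrides both methods so equality is based on the age field.
    static class ByAge {
        final String age;

        ByAge(String age) {
            this.age = age;
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(age);
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof ByAge && Objects.equals(age, ((ByAge) obj).age);
        }
    }

    // Adds two distinct Plain objects with the same age; both are kept.
    static int countPlain() {
        Set<Plain> set = new HashSet<>();
        set.add(new Plain("30"));
        set.add(new Plain("30"));
        return set.size();
    }

    // Adds two ByAge objects with the same age; the second is rejected.
    static int countByAge() {
        Set<ByAge> set = new HashSet<>();
        set.add(new ByAge("30"));
        set.add(new ByAge("30"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(countPlain());  // 2
        System.out.println(countByAge());  // 1
    }
}
```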
Answer 3:
And if you are not able to override the hashCode and equals methods of MyDataClass, you could write a wrapper class that handles this.
Answer 4:
    public Set<Object> findDuplicates(List<Object> list) {
        Set<Object> items = new HashSet<Object>();
        Set<Object> duplicates = new HashSet<Object>();
        for (Object item : list) {
            if (items.contains(item)) {
                duplicates.add(item);
            } else {
                items.add(item);
            }
        }
        return duplicates;
    }
Source: https://stackoverflow.com/questions/6737212/how-to-find-duplicates-in-an-arraylistobject