Question
My code is like this:
public class CaseAccentInsensitiveEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0;
    }

    public int GetHashCode(string obj)
    {
        // not sure what to put here
    }
}
I know the role of GetHashCode in this context. What I'm missing is how to produce the InvariantCulture, IgnoreNonSpace and IgnoreCase version of obj so that I can return its hash code.
I could remove the diacritics and the case from obj myself and then return its hash code, but I wonder if there's a better alternative.
Answer 1:
Returning 0 inside GetHashCode() works (as pointed out by @Michael Perrenoud) because hash-based collections such as Dictionary and HashSet call Equals() only when GetHashCode() returns the same value for both objects.
The rule is that GetHashCode() must return the same value for objects that are equal. Unequal objects may share a hash code, but equal objects must never differ.
The drawback is that HashSet (or Dictionary) performance degrades to the point where it is the same as using a List: every item lands in the same bucket, so finding an item means calling Equals() against each stored element.
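To make that cost visible, here is a minimal sketch that counts Equals() calls for a constant hash versus a real one. CountingComparer and HashCostDemo are hypothetical names made up for this illustration, and the exact counts depend on the runtime's HashSet implementation, so only the difference in magnitude matters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration: counts how often Equals() is invoked.
class CountingComparer : IEqualityComparer<string>
{
    private readonly bool constantHash;
    public int EqualsCalls;

    public CountingComparer(bool constantHash) { this.constantHash = constantHash; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    public int GetHashCode(string obj)
    {
        // Either every key shares one hash (0) or we use the string's own hash.
        return constantHash ? 0 : obj.GetHashCode();
    }
}

static class HashCostDemo
{
    // Fills a HashSet with distinct strings, does one lookup,
    // and reports how many times Equals() was called in total.
    public static int CountEqualsCalls(bool constantHash, int itemCount)
    {
        var comparer = new CountingComparer(constantHash);
        var set = new HashSet<string>(comparer);
        for (int i = 0; i < itemCount; i++)
        {
            set.Add("item" + i);
        }
        set.Contains("item0");
        return comparer.EqualsCalls;
    }

    static void Main()
    {
        Console.WriteLine("constant hash: " + CountEqualsCalls(true, 1000));
        Console.WriteLine("real hash:     " + CountEqualsCalls(false, 1000));
    }
}
```

With the constant hash, each Add() has to compare the new string against the items already stored, so the number of Equals() calls grows roughly quadratically with the number of items.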
A faster approach is to convert the string to an accent-insensitive, case-insensitive form and return the hash code of that form.
Code to remove accents (diacritics), from this post:
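// Requires: using System.Globalization; using System.Linq; using System.Text;
static string RemoveDiacritics(string text)
{
    // Decompose each character into base character + combining marks,
    // drop the combining marks, then recompose what is left.
    return string.Concat(
        text.Normalize(NormalizationForm.FormD)
            .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) !=
                         UnicodeCategory.NonSpacingMark)
    ).Normalize(NormalizationForm.FormC);
}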
Comparer code:
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public class CaseAccentInsensitiveEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0;
    }

    public int GetHashCode(string obj)
    {
        // Hash the accent-stripped, upper-cased form so that strings which
        // Equals() treats as equal also produce the same hash code.
        return obj != null ? RemoveDiacritics(obj).ToUpperInvariant().GetHashCode() : 0;
    }

    private static string RemoveDiacritics(string text)
    {
        return string.Concat(
            text.Normalize(NormalizationForm.FormD)
                .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) !=
                             UnicodeCategory.NonSpacingMark)
        ).Normalize(NormalizationForm.FormC);
    }
}
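As a possible alternative to stripping diacritics by hand, newer runtimes expose CompareInfo.GetHashCode(string, CompareOptions), which hashes a string under the same options used for the comparison. As far as I know it is public in .NET Core 2.0+ and .NET Framework 4.7.1+, so treat availability on your target as an assumption to check. The class name CompareInfoEqualityComparer and the demo are made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Sketch: Equals() and GetHashCode() both go through the same CompareInfo
// and the same CompareOptions, so they cannot disagree about equality.
public sealed class CompareInfoEqualityComparer : IEqualityComparer<string>
{
    private static readonly CompareInfo Invariant = CultureInfo.InvariantCulture.CompareInfo;

    private const CompareOptions Options =
        CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;

    public bool Equals(string x, string y)
    {
        return Invariant.Compare(x, y, Options) == 0;
    }

    public int GetHashCode(string obj)
    {
        // CompareInfo.GetHashCode throws on null, so handle it here.
        return obj != null ? Invariant.GetHashCode(obj, Options) : 0;
    }
}

public static class CompareInfoDemo
{
    public static void Main()
    {
        var set = new HashSet<string>(new CompareInfoEqualityComparer()) { "café" };

        // Differs only in case and accent, so it should be found.
        Console.WriteLine(set.Contains("CAFE"));

        // A genuinely different string should not be found.
        Console.WriteLine(set.Contains("coffee"));
    }
}
```

Using one CompareInfo for both methods avoids the risk that a hand-written normalization and the culture-aware comparison disagree on some edge-case characters.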
Answer 2:
Ah, excuse me, I had my methods mixed up. When I implemented something like this before, I just returned the hash code of the object itself (return obj.GetHashCode();), thinking that it would always enter the Equals method.
Okay, after much confusion I believe I've got myself straight. I found that always returning zero forces the collection to fall back on the comparer's Equals method. I'm looking for the code where I implemented this so I can prove it and put it up here.
Here's the code to prove it.
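// Requires: using System.Collections.Generic;
class MyArrayComparer : EqualityComparer<object[]>
{
    public override bool Equals(object[] x, object[] y)
    {
        if (x.Length != y.Length) { return false; }
        for (int i = 0; i < x.Length; i++)
        {
            // object.Equals(a, b) also copes with null elements,
            // where x[i].Equals(y[i]) would throw.
            if (!object.Equals(x[i], y[i]))
            {
                return false;
            }
        }
        return true;
    }

    public override int GetHashCode(object[] obj)
    {
        // Constant hash: every key collides, so Equals decides every lookup.
        return 0;
    }
}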
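If the constant hash becomes a bottleneck, one option is to combine the element hash codes instead. The following is only a sketch under that assumption: HashedArrayComparer and ArrayKeyDemo are hypothetical names, and the 17/31 multipliers are a common convention rather than anything required.

```csharp
using System;
using System.Collections.Generic;

// Sketch: same element-wise equality, but with a hash derived from the elements.
class HashedArrayComparer : EqualityComparer<object[]>
{
    public override bool Equals(object[] x, object[] y)
    {
        if (ReferenceEquals(x, y)) { return true; }
        if (x == null || y == null || x.Length != y.Length) { return false; }
        for (int i = 0; i < x.Length; i++)
        {
            if (!object.Equals(x[i], y[i])) { return false; }
        }
        return true;
    }

    public override int GetHashCode(object[] obj)
    {
        if (obj == null) { return 0; }
        unchecked // integer overflow is fine for hash mixing
        {
            int hash = 17;
            foreach (object item in obj)
            {
                hash = hash * 31 + (item != null ? item.GetHashCode() : 0);
            }
            return hash;
        }
    }
}

static class ArrayKeyDemo
{
    static void Main()
    {
        var lookup = new Dictionary<object[], string>(new HashedArrayComparer());
        lookup[new object[] { 1, "a" }] = "first";

        // A different array instance with equal elements finds the same entry.
        Console.WriteLine(lookup.ContainsKey(new object[] { 1, "a" }));
    }
}
```

Equal arrays still produce equal hash codes, so the rule from the first answer holds, while unequal arrays usually land in different buckets.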
Source: https://stackoverflow.com/questions/13017660/im-implementing-a-caseaccentinsensitiveequalitycomparer-for-strings-im-not-su