I'm wondering whether I can get a consensus on which method is the better approach to creating a distinct set of elements: a C# HashSet
or using IEnumera
For large collections, HashSet is likely to be faster. It relies on the hash code of each object to quickly determine whether or not an element already exists in the set.
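To illustrate the hash-based membership check described here, a minimal sketch follows. The class and method names (`HashSetDemo`, `ToDistinctSet`) are made up for this example; only `HashSet<T>.Add` and `Contains` are framework calls.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Hypothetical helper: collects the distinct values of a sequence.
    // HashSet<T>.Add returns false (and changes nothing) when the item
    // is already present, so duplicates are dropped automatically.
    public static HashSet<int> ToDistinctSet(IEnumerable<int> items)
    {
        var set = new HashSet<int>();
        foreach (var item in items)
        {
            set.Add(item);
        }
        return set;
    }

    public static void Main()
    {
        var set = ToDistinctSet(new[] { 3, 1, 3, 2, 1 });
        Console.WriteLine(set.Count);       // prints 3
        Console.WriteLine(set.Contains(2)); // prints True
    }
}
```

For reference types, the lookup depends on `GetHashCode` and `Equals`, so custom classes need sensible overrides (or an `IEqualityComparer<T>`) for duplicates to be detected.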
In practice, it (most likely) won't matter (but you should measure if you care).
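If you do want to measure, something along these lines is one rough way to do it. This is only a sketch: the data size, value range, and helper names are arbitrary assumptions, and a single `Stopwatch` run is not a rigorous benchmark (no warm-up, no repetition), so treat the numbers as indicative at best.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class DistinctTiming
{
    // Hypothetical helpers: both return the number of distinct values.
    public static int CountWithHashSet(IEnumerable<int> items) => new HashSet<int>(items).Count;
    public static int CountWithDistinct(IEnumerable<int> items) => items.Distinct().Count();

    public static void Main()
    {
        // Arbitrary test data: one million values drawn from 0..999.
        var random = new Random(42);
        var data = Enumerable.Range(0, 1000000).Select(_ => random.Next(1000)).ToArray();

        var sw = Stopwatch.StartNew();
        int viaHashSet = CountWithHashSet(data);
        sw.Stop();
        Console.WriteLine($"HashSet:  {viaHashSet} distinct in {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        int viaDistinct = CountWithDistinct(data);
        sw.Stop();
        Console.WriteLine($"Distinct: {viaDistinct} distinct in {sw.ElapsedMilliseconds} ms");
    }
}
```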
I instinctively guessed at first that HashSet would be faster, because of the fast hash checking it uses. However, I looked up the current (4.0) implementation of Distinct in the reference sources, and it uses a similar Set class (which also relies on hashing) under the covers. Conclusion: there is no practical performance difference.
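To make that point concrete, here is a rough sketch of how a Distinct-style operator can be built on a hash-based set. This is not the framework's actual source: `MyDistinct` is a hypothetical name, and a public `HashSet<T>` stands in for the internal Set class mentioned above.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctSketch
{
    // Hypothetical extension method, for illustration only.
    // Lazily yields each item the first time it is seen.
    public static IEnumerable<T> MyDistinct<T>(this IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (var item in source)
        {
            // Add returns true only for items not already in the set.
            if (seen.Add(item))
            {
                yield return item;
            }
        }
    }

    public static void Main()
    {
        var input = new[] { "b", "a", "b", "c", "a" };
        Console.WriteLine(string.Join(",", input.MyDistinct())); // prints b,a,c
    }
}
```

Either way, every element is hashed once and checked against a set, which is why the two approaches end up doing roughly the same amount of work.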
For your case, I would go with .Distinct for readability - it clearly conveys the intent of the code. However, I agree with one of the other answers that you should probably perform this operation in the DB if possible.