Question
I have two List<T> objects (where T is the same type for both objects), and I need to determine whether they contain the same set of values, even if the values aren't in the same order.
Do the objects have any built-in mechanisms to accomplish this, or do I need to write my own algorithm?
Or perhaps I should be using a different type of collection, rather than List<T>?
If I were to write my own algorithm, it would probably consist of the following steps - I'll try to optimize this in the final version, if I go this route:
- Do the two collections contain the same number of values? If not, return false.
- Count the number of times each value appears in each collection, and return false if the counts aren't equal.
- If I reach the end of both collections without any inequality in value counts, return true.
I know there are some caveats to this, such as the fact that T has to be comparable. I'm using the default comparison for now (e.g. .Equals()) with appropriate constraints defined for the generic type.
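As a rough sketch of the steps listed above, the counting approach might look like the following. The class name ListComparison and the method name HaveSameElements are hypothetical, and the sketch assumes that T has a GetHashCode() consistent with its Equals() (or that a suitable comparer is passed in), since the counts are kept in a Dictionary.

```csharp
using System;
using System.Collections.Generic;

public static class ListComparison
{
    // Hypothetical helper: true when both collections hold the same values
    // with the same multiplicities, regardless of order.
    public static bool HaveSameElements<T>(
        IReadOnlyCollection<T> first,
        IReadOnlyCollection<T> second,
        IEqualityComparer<T> comparer = null)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));

        // Step 1: collections of different sizes can never match.
        if (first.Count != second.Count) return false;

        // Step 2: tally the first collection. Dictionary keys cannot be null,
        // so null elements are counted separately.
        var counts = new Dictionary<T, int>(comparer ?? EqualityComparer<T>.Default);
        int nullCount = 0;
        foreach (T item in first)
        {
            if (item == null) { nullCount++; continue; }
            counts.TryGetValue(item, out int seen);
            counts[item] = seen + 1;
        }

        // Step 3: subtract the second collection from the tally.
        foreach (T item in second)
        {
            if (item == null)
            {
                if (--nullCount < 0) return false;
                continue;
            }
            if (!counts.TryGetValue(item, out int remaining) || remaining == 0)
                return false;
            counts[item] = remaining - 1;
        }

        // The sizes were equal and no tally went negative,
        // so every count has returned to zero.
        return true;
    }
}
```

Because List<T> implements IReadOnlyCollection<T>, two lists can be passed directly, for example ListComparison.HaveSameElements(listA, listB).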
Answer 1:
Based on available information, I suspect the most efficient solution that supports duplicates is to:
- Compare the sizes of the two lists. If they are unequal, return false; otherwise continue.
- Sort both lists. If you must retain the ordering of the original lists, sort copies of the lists instead.
- Compare the elements at each position of the sorted lists, returning false as soon as the values at a given position differ. Return true if you have compared all elements without finding a difference.
Note that I have assumed that sufficient memory is available for the duration of this operation to create a sorted duplicate of the lists (should order preservation be a requirement).
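A minimal sketch of this sort-and-compare approach follows. The names SortedComparison and SortedEquals are hypothetical, and the sketch always sorts copies so that the original lists keep their order. It assumes T can be ordered, either by implementing IComparable<T> or through a supplied IComparer<T>.

```csharp
using System;
using System.Collections.Generic;

public static class SortedComparison
{
    // Hypothetical helper: sorts copies of both lists and compares them
    // position by position.
    public static bool SortedEquals<T>(
        List<T> first,
        List<T> second,
        IComparer<T> comparer = null)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));
        comparer = comparer ?? Comparer<T>.Default;

        // Lists of different sizes can never match.
        if (first.Count != second.Count) return false;

        // Sort copies so the callers' lists are left untouched.
        var sortedFirst = new List<T>(first);
        var sortedSecond = new List<T>(second);
        sortedFirst.Sort(comparer);
        sortedSecond.Sort(comparer);

        // Any mismatch at the same position means the contents differ.
        for (int i = 0; i < sortedFirst.Count; i++)
        {
            if (comparer.Compare(sortedFirst[i], sortedSecond[i]) != 0)
                return false;
        }
        return true;
    }
}
```

Sorting costs O(n log n) comparisons per list, so this version mainly makes sense when T has a natural ordering but no suitable hash code.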
Answer 2:
So we'll start out with just a simple SetEquals, and go from there. HashSet already has an implementation of such a method that can compare two sets for equality, so we can just create a wrapper around it so that we can use it with any type of sequence:
    // Extension methods must be declared inside a static class.
    public static bool SetEquals<T>(this IEnumerable<T> first, IEnumerable<T> second,
        IEqualityComparer<T> comparer = null)
    {
        return new HashSet<T>(second, comparer ?? EqualityComparer<T>.Default)
            .SetEquals(first);
    }
Next, to account for the fact that you have a bag, not a set, we can take the two sequences that you have, group each one by value, and project each group into a pair holding the item and the number of times it occurs. If we do that for both sequences, we can compare the resulting sequences of pairs as sets. If the two sequences of key-count pairs are set-equal, then the original sequences are bag-equal:
    // Requires System.Linq for GroupBy and Count.
    public static bool BagEquals<T>(
        this IEnumerable<T> first,
        IEnumerable<T> second)
    {
        Func<IEnumerable<T>, IEnumerable<KeyValuePair<T, int>>> groupItems =
            sequence => sequence.GroupBy(item => item,
                (key, items) => new KeyValuePair<T, int>(key, items.Count()));
        return groupItems(first)
            .SetEquals(groupItems(second));
    }
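To make the set-versus-bag distinction concrete, here is a small self-contained illustration restricted to int sequences. The names BagVersusSet, SameSet and SameBag are hypothetical and are not part of the answer's code. This sketch stores each (value, count) pair as a value tuple rather than a KeyValuePair.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BagVersusSet
{
    // Set equality: duplicates are ignored.
    public static bool SameSet(IEnumerable<int> first, IEnumerable<int> second)
    {
        return new HashSet<int>(first).SetEquals(second);
    }

    // Bag equality: compare the (value, count) pairs of each sequence as sets.
    public static bool SameBag(IEnumerable<int> first, IEnumerable<int> second)
    {
        var firstCounts = first.GroupBy(x => x).Select(g => (g.Key, g.Count()));
        var secondCounts = second.GroupBy(x => x).Select(g => (g.Key, g.Count()));
        return new HashSet<(int, int)>(firstCounts).SetEquals(secondCounts);
    }
}
```

With these helpers, the sequences { 1, 1, 2 } and { 1, 2, 2 } are equal as sets, because both contain only the values 1 and 2. They are not equal as bags, because the counts of 1 and 2 differ.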
Answer 3:
Here is a reimplementation of CollectionAssert.AreEquivalent (the reference code was decompiled with DotPeek). Instead of throwing an exception, it returns a bool.
    using System;
    using System.Collections;
    using System.Collections.Generic;

    public class CollectionMethods
    {
        public static bool AreEquivalent(ICollection expected, ICollection actual)
        {
            // A few quick tests can give an easy true or an easy false.
            // Is one collection null and the other not?
            if (Object.ReferenceEquals(expected, null) != Object.ReferenceEquals(actual, null))
                return false;
            // Do they both point at the same object (or are both null)?
            if (Object.ReferenceEquals(expected, actual))
                return true;
            // Do they have different counts?
            if (expected.Count != actual.Count)
                return false;
            // Do we have two empty collections?
            if (expected.Count == 0)
                return true;
            // Ran out of easy tests, now we have to do the slow work.
            int nullCount1;
            Dictionary<object, int> elementCounts1 = CollectionMethods.GetElementCounts(expected, out nullCount1);
            int nullCount2;
            Dictionary<object, int> elementCounts2 = CollectionMethods.GetElementCounts(actual, out nullCount2);
            // One last quick check: do the two collections have the same number of null elements?
            if (nullCount2 != nullCount1)
            {
                return false;
            }
            // Check that each element appears the same number of times in both collections.
            foreach (KeyValuePair<object, int> kvp in elementCounts1)
            {
                int expectedCount = kvp.Value;
                int actualCount;
                // A missing key leaves actualCount at 0, which fails the comparison below.
                elementCounts2.TryGetValue(kvp.Key, out actualCount);
                if (expectedCount != actualCount)
                {
                    return false;
                }
            }
            return true;
        }

        // Counts the occurrences of each non-null element; nulls are tallied
        // separately because a Dictionary cannot use null as a key.
        private static Dictionary<object, int> GetElementCounts(ICollection collection, out int nullCount)
        {
            Dictionary<object, int> dictionary = new Dictionary<object, int>();
            nullCount = 0;
            foreach (object key in (IEnumerable)collection)
            {
                if (key == null)
                {
                    ++nullCount;
                }
                else
                {
                    int num;
                    dictionary.TryGetValue(key, out num);
                    ++num;
                    dictionary[key] = num;
                }
            }
            return dictionary;
        }
    }
Source: https://stackoverflow.com/questions/33245613/whats-the-best-way-to-determine-whether-two-listt-objects-contain-the-same-se