Question
In my MatchCollection, I get matches of the same thing. Like this:
string text = @"match match match";
Regex R = new Regex("match");
MatchCollection M = R.Matches(text);
How can I remove the duplicate matches, and what is the fastest way to do it?
Assume "duplicate" here means that the match contains the exact same string.
Answer 1:
LINQ
If you are using .NET 3.5 or later (4.7, for example), LINQ can remove the duplicate matches.
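// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
string data = "abc match match abc";
Console.WriteLine(string.Join(", ",
    Regex.Matches(data, @"([^\s]+)")
         .OfType<Match>()                  // MatchCollection is non-generic, so cast each item
         .Select(m => m.Groups[0].Value)   // group 0 is the whole match
         .Distinct()
         .ToArray()                        // needed on .NET 3.5; optional from .NET 4 on
));
// Outputs abc, match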
.NET 2 or no LINQ
Place the matches into a Hashtable, then extract the strings:
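// Requires: using System; using System.Collections; using System.Text.RegularExpressions;
string data = "abc match match abc";
MatchCollection mc = Regex.Matches(data, @"[^\s]+");
Hashtable hash = new Hashtable();
foreach (Match mt in mc)
{
    string foundMatch = mt.Value;
    // Add each distinct string once; only the key matters, the value is a placeholder.
    if (!hash.ContainsKey(foundMatch))
        hash.Add(foundMatch, string.Empty);
}
// Outputs abc and match (Hashtable does not guarantee enumeration order).
foreach (DictionaryEntry element in hash)
    Console.WriteLine(element.Key);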
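If first-seen order matters, the Hashtable loop above can be swapped for a HashSet<string> (available from .NET 3.5). This is only a sketch, not a benchmarked answer to the "fastest" part of the question. The class name MatchDedup and the helper UniqueValues are made up for illustration. It makes a single pass over the matches and keeps each distinct value the first time it appears.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchDedup
{
    // Hypothetical helper (not part of the original answer): returns each
    // distinct match value once, in order of first appearance.
    public static List<string> UniqueValues(MatchCollection matches)
    {
        var seen = new HashSet<string>();   // default comparer: ordinal, case-sensitive
        var unique = new List<string>();
        foreach (Match m in matches)
        {
            // HashSet.Add returns false when the value is already present.
            if (seen.Add(m.Value))
                unique.Add(m.Value);
        }
        return unique;
    }

    public static void Main()
    {
        string data = "abc match match abc";
        List<string> unique = UniqueValues(Regex.Matches(data, @"[^\s]+"));
        Console.WriteLine(string.Join(", ", unique.ToArray()));
        // Prints: abc, match
    }
}
```

Passing StringComparer.OrdinalIgnoreCase to the HashSet constructor would treat "Match" and "match" as duplicates, if that is what "the exact same string" should mean in your case.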
Answer 2:
Try a backreference. This pattern matches a word that is immediately followed by the same word:
Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b", RegexOptions.Compiled);
string text = @"match match match";
MatchCollection matches = rx.Matches(text);
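To see what that pattern returns, here is a small sketch. The class name AdjacentRepeats and the helper Find are made up for illustration. On the sample text it produces one match covering the first two words. The third word is left over, because there is nothing after it for the backreference to pair with. So the pattern locates adjacent repeats in the input; it does not by itself remove duplicates from a MatchCollection.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AdjacentRepeats
{
    // Same pattern as in the answer; \k<word> must repeat whatever (?<word>...) captured.
    private static readonly Regex Rx =
        new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b", RegexOptions.Compiled);

    // Hypothetical helper: returns the full text of every adjacent-repeat match.
    public static List<string> Find(string text)
    {
        var found = new List<string>();
        foreach (Match m in Rx.Matches(text))
            found.Add(m.Value);
        return found;
    }

    public static void Main()
    {
        foreach (string hit in Find("match match match"))
            Console.WriteLine("'{0}'", hit);
        // Prints: 'match match'
    }
}
```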
Source: https://stackoverflow.com/questions/8592806/how-to-remove-duplicate-matches-in-a-matchcollection