C# string comparison ignoring spaces, carriage return or line breaks

执念已碎 2020-12-05 13:07

How can I compare two strings in C# while ignoring case, spaces, and line breaks? I also need two null strings to be treated as equal.

Thanks!

10 Answers
  • 2020-12-05 13:22

    I would probably start by removing the characters you don't want to compare from the string before comparing. If performance is a concern, you might look at storing a version of each string with the characters already removed.

    Alternatively, you could write a compare routine that would skip over the characters you want to ignore. But that just seems like more work to me.
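
    As a rough sketch of the second idea, the routine below walks both strings with two indexes and skips whitespace as it goes, so no intermediate strings are allocated. The class and method names are hypothetical, and it assumes "characters to ignore" means anything char.IsWhiteSpace reports, with case folded via char.ToUpperInvariant.

    ```csharp
    using System;

    public static class LooseComparer
    {
        // Hypothetical helper: compares two strings character by character,
        // skipping whitespace on both sides and ignoring case.
        public static bool EqualsIgnoringWhitespaceAndCase(string a, string b)
        {
            // Two nulls are equal; a null never equals a non-null string.
            if (a == null || b == null)
                return a == null && b == null;

            int i = 0, j = 0;
            while (true)
            {
                // Advance each index past any whitespace (spaces, tabs, \r, \n).
                while (i < a.Length && char.IsWhiteSpace(a[i])) i++;
                while (j < b.Length && char.IsWhiteSpace(b[j])) j++;

                // If either string is exhausted, they match only if both are.
                if (i == a.Length || j == b.Length)
                    return i == a.Length && j == b.Length;

                if (char.ToUpperInvariant(a[i]) != char.ToUpperInvariant(b[j]))
                    return false;

                i++;
                j++;
            }
        }
    }
    ```

    It would be called as LooseComparer.EqualsIgnoringWhitespaceAndCase(s1, s2). Whether the extra code is worth it over simply stripping the characters first depends on how hot the comparison is.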

  • 2020-12-05 13:27

    Remove all the characters you don't want and then use the ToLower() method to ignore case.

    edit: While the above works, it's better to use StringComparison.OrdinalIgnoreCase. Just pass it as the second argument to the Equals method.

  • 2020-12-05 13:29

    You should normalize each string by removing the characters that you don't want to compare and then you can perform a String.Equals with a StringComparison that ignores case.

    Something like this:

    // Requires: using System; using System.Text.RegularExpressions;
    string s1 = "HeLLo    wOrld!";
    string s2 = "Hello\n    WORLd!";
    
    // \s matches any whitespace character, including \r and \n.
    string normalized1 = Regex.Replace(s1, @"\s", "");
    string normalized2 = Regex.Replace(s2, @"\s", "");
    
    bool stringEquals = String.Equals(
        normalized1, 
        normalized2, 
        StringComparison.OrdinalIgnoreCase);
    
    Console.WriteLine(stringEquals);
    

    Here Regex.Replace is used first to remove all whitespace characters. The special case of both strings being null is not treated here but you can easily handle that case before performing the string normalization.
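
    One possible way to add the null handling is to check for nulls before normalizing, since Regex.Replace throws on a null input. This is only a sketch: the class and method names are hypothetical, and it assumes a null should never equal a non-null string.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class NormalizedComparer
    {
        // Hypothetical wrapper around the normalization shown above.
        public static bool NormalizedEquals(string s1, string s2)
        {
            // Handle nulls first: equal only when both are null.
            if (s1 == null || s2 == null)
                return s1 == null && s2 == null;

            // Strip all whitespace, then compare without regard to case.
            string normalized1 = Regex.Replace(s1, @"\s", "");
            string normalized2 = Regex.Replace(s2, @"\s", "");

            return string.Equals(
                normalized1,
                normalized2,
                StringComparison.OrdinalIgnoreCase);
        }
    }
    ```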

  • 2020-12-05 13:31

    You can also use the following custom extension methods:

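    // Extension methods must live in a static class.
    // Requires: using System.Collections.Generic; using System.Linq; using System.Text;
    public static string ExceptChars(this string str, IEnumerable<char> toExclude)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in str)
        {
            if (!toExclude.Contains(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static bool SpaceCaseInsenstiveComparision(this string stringa, string stringb)
    {
        // Two nulls are equal; a null never equals a non-null string.
        // (The original one-liner threw NullReferenceException when only one was null.)
        if (stringa == null || stringb == null)
            return stringa == null && stringb == null;

        char[] ignored = { ' ', '\t', '\n', '\r' };
        return stringa.ToLower().ExceptChars(ignored)
            .Equals(stringb.ToLower().ExceptChars(ignored));
    }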
    

    Then use it like this:

    "Te  st".SpaceCaseInsenstiveComparision("Te st");
    
  • 2020-12-05 13:32

    Another option is the LINQ SequenceEqual method, which in my tests was more than twice as fast as the Regex approach used in other answers, and which is easy to read and maintain.

    public static bool Equals_Linq(string s1, string s2)
    {
        return Enumerable.SequenceEqual(
            s1.Where(c => !char.IsWhiteSpace(c)).Select(char.ToUpperInvariant),
            s2.Where(c => !char.IsWhiteSpace(c)).Select(char.ToUpperInvariant));
    }
    
    public static bool Equals_Regex(string s1, string s2)
    {
        return string.Equals(
            Regex.Replace(s1, @"\s", ""),
            Regex.Replace(s2, @"\s", ""),
            StringComparison.OrdinalIgnoreCase);
    }
    

    Here is the simple performance test code I used:

    var s1 = "HeLLo    wOrld!";
    var s2 = "Hello\n    WORLd!";
    var watch = Stopwatch.StartNew();
    for (var i = 0; i < 1000000; i++)
    {
        Equals_Linq(s1, s2);
    }
    Console.WriteLine(watch.Elapsed); // ~1.7 seconds
    watch = Stopwatch.StartNew();
    for (var i = 0; i < 1000000; i++)
    {
        Equals_Regex(s1, s2);
    }
    Console.WriteLine(watch.Elapsed); // ~4.6 seconds
    
  • 2020-12-05 13:35

    This may also work, although note that IgnoreSymbols ignores punctuation and other symbols as well as whitespace:

    String.Compare(s1, s2, CultureInfo.CurrentCulture, CompareOptions.IgnoreCase | CompareOptions.IgnoreSymbols) == 0
    

    Edit:

    IgnoreSymbols: Indicates that the string comparison must ignore symbols, such as white-space characters, punctuation, currency symbols, the percent sign, mathematical symbols, the ampersand, and so on.
