Why does string.Compare seem to handle accented characters inconsistently?

孤城傲影 2020-12-10 02:52

If I execute the following statement:

string.Compare("mun", "mün", true, CultureInfo.InvariantCulture)

The result is '-1', indicating

3 answers
  •  半阙折子戏
    2020-12-10 03:28

    It looks like the accented character is only being used in a sort of "tie-break" situation - in other words, if the strings are otherwise equal.

    Here's some sample code to demonstrate:

    using System;
    using System.Globalization;
    
    class Test
    {
        static void Main()
        {
            Compare("mun", "mün");
            Compare("muna", "münb");
            Compare("munb", "müna");
        }
    
        static void Compare(string x, string y)
        {
            int result = string.Compare(x, y, true,
                                        CultureInfo.InvariantCulture);
    
            Console.WriteLine("{0}; {1}; {2}", x, y, result);
        }
    }
    

    (I've tried adding a space after the "n" as well, to see if it was done on word boundaries - it isn't.)

    Results:

    mun; mün; -1
    muna; münb; -1
    munb; müna; 1
    

    I suspect this is correct by various complicated Unicode rules - but I don't know enough about them.
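
    One way to check the tie-break reading is to repeat the comparisons with accents ignored. The sketch below assumes CompareOptions.IgnoreNonSpace treats "ü" like "u" under the invariant culture; the class and method names are made up for illustration. If the accent only matters when the strings are otherwise equal, the first pair should come out equal while the other two keep their sign. An ordinal comparison is included for contrast, since it compares raw UTF-16 code units and so puts "u" before "ü" in every pair.

```csharp
using System;
using System.Globalization;

public static class AccentCompareSketch
{
    // Culture-aware comparison that ignores case and diacritics
    // (hypothetical helper, not part of the original sample).
    public static int IgnoringAccents(string x, string y)
    {
        return Math.Sign(string.Compare(
            x, y, CultureInfo.InvariantCulture,
            CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace));
    }

    // Code-unit comparison: no linguistic rules, 'u' (U+0075) < 'ü' (U+00FC).
    public static int Ordinal(string x, string y)
    {
        return Math.Sign(string.CompareOrdinal(x, y));
    }

    public static void Main()
    {
        string[][] pairs =
        {
            new[] { "mun", "mün" },
            new[] { "muna", "münb" },
            new[] { "munb", "müna" },
        };

        foreach (string[] p in pairs)
        {
            // Columns: x; y; accents ignored; ordinal
            Console.WriteLine("{0}; {1}; {2}; {3}",
                p[0], p[1], IgnoringAccents(p[0], p[1]), Ordinal(p[0], p[1]));
        }
        // Prints:
        // mun; mün; 0; -1
        // muna; münb; -1; -1
        // munb; müna; 1; -1
    }
}
```

    With accents ignored, only the first pair changes (from -1 to 0), which fits the idea that the accent is consulted only after the base letters have failed to decide the order.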

    As for whether you need to take this into account... I wouldn't expect so. What are you doing that is thrown by this?
