Regex accent insensitive?

天命终不由人 2020-12-09 17:38

I need a Regex in a C# program.


I have to capture the name of a file with a specific structure.

I used the \w cha

7 answers
  •  谎友^ (OP)
    2020-12-09 18:37

    You could simply replace the diacritics with their alphabetic (near-)equivalents and then use your current regex.

    See for example:

    How do I remove diacritics (accents) from a string in .NET?

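    using System.Globalization;
    using System.Text;

    static string RemoveDiacritics(string input)
    {
        // Decompose each accented character into a base letter plus combining marks.
        string normalized = input.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder();

        foreach (char ch in normalized)
        {
            // Keep everything except the combining (non-spacing) marks.
            if (CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(ch);
            }
        }

        // Recompose what is left into canonical composed form.
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }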
    
    string s1 = "Renato Núñez David DeJesús Edwin Encarnación";
    string s2 = RemoveDiacritics(s1);
    // s2 = "Renato Nunez David DeJesus Edwin Encarnacion"
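
To tie this back to the file-name question, here is a minimal sketch of stripping the accents first and matching afterwards. The question's actual file-name structure is not shown, so the pattern (ASCII letters, an underscore, eight digits, ".txt"), the sample name, and the helper name StripAccents are all made up for illustration. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Same technique as RemoveDiacritics: decompose, drop combining marks, recompose.
static string StripAccents(string input)
{
    var builder = new StringBuilder();
    foreach (char ch in input.Normalize(NormalizationForm.FormD))
    {
        if (CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
        {
            builder.Append(ch);
        }
    }
    return builder.ToString().Normalize(NormalizationForm.FormC);
}

// Hypothetical structure: ASCII letters, '_', 8 digits, ".txt" (an assumption, not from the question).
var pattern = new Regex(@"^(?<name>[A-Za-z]+)_(?<date>[0-9]{8})\.txt$");

string fileName = "Núñez_20201209.txt"; // made-up sample name

bool rawMatches = pattern.IsMatch(fileName);      // accented letters are outside [A-Za-z]
Match m = pattern.Match(StripAccents(fileName));  // matched against "Nunez_20201209.txt"

Console.WriteLine(rawMatches);              // False
Console.WriteLine(m.Success);               // True
Console.WriteLine(m.Groups["name"].Value);  // Nunez
```

Note that the captured groups come from the stripped string, not the original, so if the accented name is needed afterwards it has to be taken from the original file name.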
    
