C# Regular Expressions with \Uxxxxxxxx characters in the pattern

孤城傲影 2020-12-19 07:21
Regex.IsMatch("foo", "[\U00010000-\U0010FFFF]")

Throws: System.ArgumentException: parsing "[-]" - [x-y] range in reverse order.

3 answers
  •  盖世英雄少女心
    2020-12-19 08:12

    They're surrogate pairs. Look at the values - they're over 65535. A char is only a 16-bit value. How would you express 65536 in only 16 bits?
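
To make that concrete, here is a small sketch (the class and method names are made up for illustration) that prints the UTF-16 code units the C# compiler produces for each `\Uxxxxxxxx` escape. Once both ends of the range are expanded this way, the character class reads as a high surrogate, then the range `\uDC00-\uDBFF`, then a low surrogate. That middle range runs backwards, which would explain the "reverse order" exception.

```csharp
using System;
using System.Linq;

static class SurrogateDemo // hypothetical name, for illustration only
{
    // Formats each UTF-16 code unit of the string as four hex digits.
    public static string CodeUnits(string s) =>
        string.Join(" ", s.Select(c => ((int)c).ToString("X4")));

    static void Main()
    {
        string first = "\U00010000";
        string last = "\U0010FFFF";

        Console.WriteLine(first.Length);     // 2: one code point, two chars
        Console.WriteLine(CodeUnits(first)); // D800 DC00
        Console.WriteLine(CodeUnits(last));  // DBFF DFFF
    }
}
```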

    Unfortunately it's not clear from the documentation how (or whether) the regular expression engine in .NET copes with characters which aren't in the basic multilingual plane. (The \uxxxx pattern in the regular expression documentation only covers 0-65535, just like \uxxxx as a C# escape sequence.)

    Is your real regular expression bigger, or are you actually just trying to see if there are any non-BMP characters in there?
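
If the goal is only to detect non-BMP characters, one possible workaround is to match the surrogate pair itself rather than the code point range. This is a minimal sketch, assuming the .NET regex engine matches on individual UTF-16 code units; `NonBmpCheck` and `ContainsNonBmp` are hypothetical names, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

static class NonBmpCheck // hypothetical helper, for illustration only
{
    // A high surrogate followed by a low surrogate encodes one code point
    // above U+FFFF. The verbatim string leaves \uXXXX for the regex engine.
    static readonly Regex SurrogatePair =
        new Regex(@"[\uD800-\uDBFF][\uDC00-\uDFFF]");

    public static bool ContainsNonBmp(string text) =>
        SurrogatePair.IsMatch(text);

    static void Main()
    {
        Console.WriteLine(ContainsNonBmp("foo"));           // False
        Console.WriteLine(ContainsNonBmp("foo\U0001F600")); // True
    }
}
```

Note that this pattern only finds well-formed pairs; a lone surrogate in the input would not match.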
