How would you get an array of Unicode code points from a .NET String?

日久生厌 2020-12-09 03:47

I have a list of character range restrictions that I need to check a string against, but the char type in .NET is UTF-16 and therefore some characters become wa

5 answers
  •  南笙 (OP)
    2020-12-09 04:18

    Doesn't seem like it should be much more complicated than this:

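    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    // Extension methods must live in a static class; this class name is a placeholder.
    public static class CodePointExtensions
    {
      public static IEnumerable<int> Utf32CodePoints( this IEnumerable<char> s )
      {
        // Encode in the machine's byte order so BitConverter reads each value back correctly.
        bool      useBigEndian = !BitConverter.IsLittleEndian;
        // No byte order mark; throw on invalid input such as an unpaired surrogate.
        Encoding  utf32        = new UTF32Encoding( useBigEndian , false , true ) ;
        // GetBytes has no IEnumerable<char> overload, so copy to a char[] first.
        byte[]    octets       = utf32.GetBytes( s.ToArray() ) ;

        // Every UTF-32 code point occupies exactly four octets.
        for ( int i = 0 ; i < octets.Length ; i+=4 )
        {
          int codePoint = BitConverter.ToInt32(octets,i);
          yield return codePoint;
        }
      }
    }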
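
For comparison, here is a minimal sketch of a different approach that is not part of the answer above: it walks the string directly and lets `char.ConvertToUtf32` combine surrogate pairs, so no intermediate byte array is needed. The class name `CodePointSketch` and the method name `ToCodePoints` are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointSketch
{
    // Hypothetical helper: walks the string one UTF-16 code unit at a time
    // and combines each surrogate pair into a single code point.
    public static List<int> ToCodePoints(string s)
    {
        var result = new List<int>(s.Length);
        for (int i = 0; i < s.Length; i++)
        {
            // Throws ArgumentException if s[i] is an unpaired surrogate.
            result.Add(char.ConvertToUtf32(s, i));
            // A high surrogate means this code point used two chars; skip the second.
            if (char.IsHighSurrogate(s[i]))
                i++;
        }
        return result;
    }
}
```

For example, `CodePointSketch.ToCodePoints("a\U0001F600")` should give two values, 0x61 and 0x1F600, even though the string holds three `char`s. Each value can then be compared directly against a list of allowed code point ranges.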
    
