C# Encoding. Converting Latin to Hebrew

栀梦 2020-12-11 17:48

I'm trying to fetch and parse an online Excel document that is written in Hebrew but, unfortunately, stored in a non-Hebrew encoding.

As an example I'm trying to convert t

1 answer
  • 2020-12-11 18:44
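    const string Str = "âìéåï_1";

    // Look up both single-byte code pages by name.
    Encoding latinEncoding = Encoding.GetEncoding("Windows-1252");
    Encoding hebrewEncoding = Encoding.GetEncoding("Windows-1255");

    // Encoding the mis-decoded text as Windows-1252 recovers the raw bytes.
    byte[] latinBytes = latinEncoding.GetBytes(Str);

    // Decoding those same bytes as Windows-1255 yields the Hebrew text.
    string hebrewString = hebrewEncoding.GetString(latinBytes);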
    

    hebrewString:

    גליון_1

    In your supplied example, "Windows-1252" is not actually ASCII; it is extended ASCII. For some reason Encoding.Convert with these two encodings cannot convert the extended ASCII range, so every character above 127 comes out as 63 (i.e. '?'). When "converting" a byte[] from one extended ASCII encoding to another, I would expect the bytes to stay the same; I would only expect a difference once they are decoded into a .NET Unicode string. I am not sure why Convert turns the characters above 127 into '?'.
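    A likely explanation is that Encoding.Convert transcodes rather than reinterprets: it decodes the source bytes into Unicode characters and then re-encodes those characters in the destination encoding. Latin letters such as "â" have no Windows-1255 equivalent, so the encoder has to substitute something for them, whereas GetBytes followed by GetString leaves the bytes untouched. The sketch below puts the two side by side. The class name MojibakeDemo and the helper Reinterpret are made up for illustration, and the RegisterProvider call assumes .NET Core or .NET 5+, where code-page encodings are not available by default.

    ```csharp
    using System;
    using System.Text;

    public static class MojibakeDemo
    {
        // Hypothetical helper (not from the original answer): recovers text that
        // was decoded with the wrong single-byte code page.
        public static string Reinterpret(string garbled, Encoding wrong, Encoding right)
        {
            byte[] rawBytes = wrong.GetBytes(garbled); // back to the original bytes
            return right.GetString(rawBytes);          // decode them as intended
        }

        public static void Main()
        {
            // Assumption: running on .NET Core / .NET 5+, where Windows-125x
            // encodings must be registered before GetEncoding can find them.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

            Encoding latin = Encoding.GetEncoding(1252);
            Encoding hebrew = Encoding.GetEncoding(1255);
            const string garbled = "âìéåï_1";

            byte[] raw = latin.GetBytes(garbled);
            Console.WriteLine(BitConverter.ToString(raw)); // E2-EC-E9-E5-EF-5F-31

            // Transcoding: bytes -> Latin characters -> Windows-1255 bytes.
            // The Latin letters cannot be represented, so these bytes differ from raw.
            byte[] transcoded = Encoding.Convert(latin, hebrew, raw);
            Console.WriteLine(BitConverter.ToString(transcoded));

            // Reinterpreting: the same bytes, read as Windows-1255.
            string fixedText = Reinterpret(garbled, latin, hebrew);
            Console.WriteLine(fixedText == "גליון_1"); // True
        }
    }
    ```

    Printing the bytes as hex avoids depending on the console's own code page, which can itself mangle Hebrew output.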
