Is it possible to display (convert?) the Unicode hex \u0092 as a Unicode HTML entity in .NET?

情话喂你 2021-01-19 07:22

I have some string that contains the following code/value:

"You won\u0092t find a ...."

It looks like that string contains the Right

2 answers
  •  甜味超标
    2021-01-19 08:01

    It looks like there's an encoding mix-up. .NET strings are UTF-16, and a right single quotation mark (the typographic apostrophe) should be U+2019, i.e. \u2019. In your example it appears as \u0092 instead. 0x92 is the byte that Windows code page 1252 uses for that character, which suggests the text was originally encoded as Windows-1252 and each byte was then copied straight into a char without being decoded. In Unicode, U+0092 is an invisible C1 control character, so it won't be displayed as an apostrophe.
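
    As a rough way to confirm this diagnosis, you could scan the string for C1 control characters (U+0080 to U+009F), which rarely appear in real text. This is only a sketch; the helper name HasC1Controls is made up for illustration.

    ```csharp
    using System;
    using System.Linq;

    class C1Check
    {
        // Hypothetical helper: true if the string contains any C1 control
        // character (U+0080..U+009F), a common sign of undecoded Windows-1252 bytes.
        static bool HasC1Controls(string s) =>
            s.Any(c => c >= '\u0080' && c <= '\u009F');

        static void Main()
        {
            Console.WriteLine(HasC1Controls("You won\u0092t")); // True
            Console.WriteLine(HasC1Controls("You won\u2019t")); // False
        }
    }
    ```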

    You can fix the problem by decoding the text properly. To do so, turn each char back into the byte it came from, then decode those bytes as Windows-1252 (code page 1252). This only works if every character in the string is below U+0100, so that it fits in a single byte:

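    using System.Linq;
    using System.Text;
    // On .NET Core / .NET 5+, first call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    string title = "You won\u0092t find a cheaper apartment * Sauna & Spa";
    byte[] bytes = title.Select(c => (byte)c).ToArray();   // each char back to its original byte
    title = Encoding.GetEncoding(1252).GetString(bytes);   // decode the bytes as Windows-1252
    // Result: "You won’t find a cheaper apartment * Sauna & Spa"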
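
    If you then want the character as an HTML numeric entity, as the question title asks, one possible approach is to replace every non-ASCII code point with &#N; yourself. The sketch below assumes the string has already been fixed so that it contains \u2019. ToNumericEntities is a hypothetical helper, not a framework method.

    ```csharp
    using System;
    using System.Text;

    class HtmlEntityDemo
    {
        // Hypothetical helper: writes every code point above 127 as a decimal
        // numeric character reference and copies ASCII through unchanged.
        static string ToNumericEntities(string s)
        {
            var sb = new StringBuilder();
            for (int i = 0; i < s.Length; i++)
            {
                // ConvertToUtf32 combines a surrogate pair into one code point
                // (it throws on an unpaired surrogate).
                int cp = char.ConvertToUtf32(s, i);
                if (char.IsSurrogatePair(s, i)) i++; // skip the low surrogate
                if (cp > 127) sb.Append("&#").Append(cp).Append(';');
                else sb.Append((char)cp);
            }
            return sb.ToString();
        }

        static void Main()
        {
            Console.WriteLine(ToNumericEntities("You won\u2019t find"));
            // prints: You won&#8217;t find
        }
    }
    ```

    Note that this sketch leaves markup characters such as & and < untouched. If the text goes into an HTML page, you would probably want to run it through System.Net.WebUtility.HtmlEncode first and then apply the helper to the result.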
    
