Question
I have text on a website that is displayed as o¨ instead of ö.
I extracted the text from the CMS and analysed its hex values:
- the ö's that are displayed correctly have c3 b6 (UTF-8)
- the ö's that are displayed incorrectly have 6f cc 88
I couldn't find out what encoding this is. What's a good way to identify the encoding?
Answer 1:
6F is the UTF-8 (ASCII) encoding of "o", nothing spectacular. CC 88 is the UTF-8 encoding of U+0308, COMBINING DIAERESIS.
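One way to check this kind of thing yourself is to decode the bytes as UTF-8 and look up the name of each resulting code point. The following is a minimal Python sketch of that idea; the helper name `describe` is made up for illustration, and it assumes the bytes really are valid UTF-8 (otherwise `decode` raises `UnicodeDecodeError`).

```python
import unicodedata

def describe(hex_bytes):
    """Decode a hex dump as UTF-8 and list each code point with its Unicode name."""
    # bytes.fromhex ignores the spaces between byte pairs
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [
        "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# The sequence that renders badly: two code points
print(describe("6f cc 88"))
# The sequence that renders correctly: one code point
print(describe("c3 b6"))
```

Both byte sequences decode cleanly as UTF-8; they differ in how many code points they contain, not in their encoding.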
You're simply looking at the decomposed form of the o-umlaut, so this is not a different encoding: both byte sequences are UTF-8. A combining diaeresis character should visually be rendered, well, combined with the previous character. If your system doesn't do that, it means it doesn't handle Unicode correctly, and/or the font you have chosen is somewhat broken. Perhaps you have to normalise your strings into the composed Unicode form (NFC) for your system to handle them correctly.
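If normalising turns out to be the way to go, here is a rough sketch of what it could look like in Python with the standard `unicodedata` module. Where in the CMS pipeline such a step would belong is not something the question tells us, so treat this only as an illustration of the transformation itself.

```python
import unicodedata

# The badly rendered sequence from the question: "o" followed by U+0308
decomposed = bytes.fromhex("6f cc 88").decode("utf-8")

# NFC composes base letter + combining mark into a single code point where one exists
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))   # number of code points before and after
print(composed.encode("utf-8").hex())   # UTF-8 bytes of the composed form
```

After normalisation the string is the single code point U+00F6, which encodes to the same c3 b6 bytes as the ö's that already display correctly.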
Source: https://stackoverflow.com/questions/38303793/how-to-identify-encoding-from-hex-values