What's the difference between UTF-8 and UTF-8 without BOM?

佛祖请我去吃肉 2020-11-21 05:45

What's the difference between UTF-8 and UTF-8 without a BOM? Which is better?

21 answers
  •  离开以前
    2020-11-21 06:05

    The other excellent answers have already covered this:

    • There is no official difference between UTF-8 and BOM-ed UTF-8.
    • A BOM-ed UTF-8 string starts with the following three bytes: EF BB BF
    • Those bytes, if present, must be ignored when extracting the string from the file/stream.
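
    As a minimal sketch of the "ignore the BOM when reading" rule, assuming the data is already known to be UTF-8 (the helper name decode_utf8_skipping_bom is made up for illustration):

```python
import codecs


def decode_utf8_skipping_bom(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM (EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


with_bom = bytes([0xEF, 0xBB, 0xBF, 0x41, 0x42, 0x43])
without_bom = bytes([0x41, 0x42, 0x43])

# Both inputs yield the same text once the BOM is skipped.
print(decode_utf8_skipping_bom(with_bom))
print(decode_utf8_skipping_bom(without_bom))

# Python's built-in "utf-8-sig" codec skips a leading BOM in the same way.
print(with_bom.decode("utf-8-sig"))
```

    Note that a plain "utf-8" decode does not strip the BOM; it keeps it as the character U+FEFF at the start of the string, which is why the explicit check (or "utf-8-sig") is needed.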

    In addition, the UTF-8 BOM can be a handy way to "smell" whether a string was encoded in UTF-8. But those same bytes could just as well be a legitimate string in some other encoding.

    For example, the data [EF BB BF 41 42 43] could either be:

    • The legitimate ISO-8859-1 string "ï»¿ABC"
    • The legitimate UTF-8 string "ABC"
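
    The ambiguity can be checked by decoding the same six bytes both ways. This is only an illustrative sketch; the variable names are arbitrary. Under ISO-8859-1 every byte maps to one character, so EF BB BF become the visible characters ï, » and ¿, whereas a BOM-aware UTF-8 decode treats them as a signature and drops them:

```python
data = bytes([0xEF, 0xBB, 0xBF, 0x41, 0x42, 0x43])

# ISO-8859-1 maps each byte to the code point of the same value,
# so the three leading bytes are ordinary characters.
as_latin1 = data.decode("iso-8859-1")

# "utf-8-sig" reads the leading EF BB BF as a BOM and skips it.
as_utf8 = data.decode("utf-8-sig")

print(repr(as_latin1))  # 'ï»¿ABC'
print(repr(as_utf8))    # 'ABC'
```

    Both decodes succeed without error, so nothing in the bytes themselves says which reading was intended.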

    So while it can be cool to recognize the encoding of a file's content by looking at its first bytes, you should not rely on this, as shown by the example above.

    Encodings should be known, not divined.
