What's the difference between UTF-8 and UTF-8 without BOM?

佛祖请我去吃肉 2020-11-21 05:45

What's the difference between UTF-8 and UTF-8 without a BOM? Which is better?

21 answers

后悔当初 (original poster)

2020-11-21 06:17
From http://en.wikipedia.org/wiki/Byte-order_mark:

The byte order mark (BOM) is a Unicode character used to signal the endianness (byte order) of a text file or stream. Its code point is U+FEFF. BOM use is optional, and, if used, should appear at the start of the text stream. Beyond its specific use as a byte-order indicator, the BOM character may also indicate which of the several Unicode representations the text is encoded in.
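To make the quoted definition concrete, here is a minimal sketch of what U+FEFF looks like on the wire in a few encodings, plus a small detector. The helper name `sniff_bom` is hypothetical and only for illustration; the `codecs.BOM_*` constants come from Python's standard library.

```python
import codecs

BOM = "\ufeff"  # the byte order mark code point, U+FEFF

# The same code point serializes to different bytes in each encoding,
# which is why it can double as an encoding signature.
utf8_bom = BOM.encode("utf-8")          # EF BB BF
utf16_le_bom = BOM.encode("utf-16-le")  # FF FE
utf16_be_bom = BOM.encode("utf-16-be")  # FE FF


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none.

    Hypothetical helper. UTF-32 is checked before UTF-16 because the
    UTF-32-LE BOM (FF FE 00 00) begins with the UTF-16-LE BOM (FF FE).
    """
    candidates = (
        ("utf-32-le", codecs.BOM_UTF32_LE),
        ("utf-32-be", codecs.BOM_UTF32_BE),
        ("utf-8-sig", codecs.BOM_UTF8),
        ("utf-16-le", codecs.BOM_UTF16_LE),
        ("utf-16-be", codecs.BOM_UTF16_BE),
    )
    for name, bom in candidates:
        if data.startswith(bom):
            return name
    return None
```

A file with no BOM makes `sniff_bom` return `None`, which is exactly the case where an editor has to guess the encoding.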

Always starting your file with a BOM ensures that it opens with the correct encoding in any editor that supports UTF-8 and the BOM.

My real problem with the absence of BOM is the following. Suppose we've got a file which contains:
```
abc
```
Without a BOM, this opens as ANSI (the system's legacy code page) in most editors. So another user of this file opens it and appends some native characters, for example:
```
abc-αβγ
```
Oops... The file is still ANSI, and guess what: "αβγ" occupies 3 bytes rather than the 6 it would take in UTF-8. The file is not UTF-8, and this causes other problems later in the development chain.
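The byte counts above can be checked with a short sketch. It assumes the "ANSI" code page in question is Windows-1253 (Greek), which the post does not actually name; any single-byte Greek code page would give the same lengths.

```python
import codecs

text = "αβγ"

# UTF-8 uses two bytes for each of these Greek letters.
utf8_bytes = text.encode("utf-8")

# Assumption: "ANSI" here means Windows-1253, one byte per letter.
ansi_bytes = text.encode("cp1253")

# The legacy bytes are not valid UTF-8, so a strict UTF-8 reader fails on them.
try:
    ansi_bytes.decode("utf-8")
    ansi_is_valid_utf8 = True
except UnicodeDecodeError:
    ansi_is_valid_utf8 = False

# Python's "utf-8-sig" codec writes the BOM on encode and strips it on decode.
with_bom = "abc".encode("utf-8-sig")
round_trip = with_bom.decode("utf-8-sig")

print(len(utf8_bytes), len(ansi_bytes))  # 6 3
```

With the BOM in place, `with_bom` starts with `codecs.BOM_UTF8`, so a BOM-aware editor has a signature to go on. Without it, "abc" is byte-for-byte identical in UTF-8 and the legacy code page, so the editor can only guess.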