Prefer charset declaration in HTML meta tag or HTTP header?

被撕碎了的回忆 2020-12-18 04:45

I'm parsing a lot of sites. Everything works fine, and I also read the charset declarations to convert encodings. Now I have a problem with http://celleheute.de/sonntagsfuhrung-3/.

2 answers
  •  暖寄归人
    2020-12-18 04:54

    To understand what modern browsers do, you should start reading at http://w3c.github.io/html/syntax.html#determining-the-character-encoding

    Steps one and two are most relevant to the question. They say

    1. If the user has explicitly instructed the user agent to override the document's character encoding with a specific encoding, optionally return that encoding with the confidence certain and abort these steps.

    2. If the transport layer specifies an encoding, and it is supported, return that encoding with the confidence certain, and abort these steps.

    which means that the real HTTP header takes precedence over everything except a user override.

    Beyond that it can get complex. A byte order mark can, for example, take precedence over the meta tag.


    UPDATE: Since this answer was written, the spec changed (around mid-2012) so that the byte order mark now takes precedence over the HTTP header.
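
For a parser, the precedence described above could be sketched roughly as follows. This is a minimal illustration, not the spec's full algorithm: the function name `sniff_encoding`, the 1024-byte meta prescan window, and the `windows-1252` fallback are assumptions made for this example, and the user-override step is left out because a crawler has no interactive user.

```python
import codecs
import re
from typing import Optional, Tuple

# Byte order marks checked first (per the UPDATE, a BOM beats the HTTP header).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# charset=... inside a Content-Type header value.
_HEADER_CHARSET = re.compile(r"charset\s*=\s*[\"']?([^\s;\"']+)", re.I)
# charset=... inside a <meta> tag (covers both the short and http-equiv forms).
_META_CHARSET = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.I)


def _supported(label: str) -> Optional[str]:
    """Return the lowercased label if Python knows the encoding, else None."""
    try:
        codecs.lookup(label)
    except LookupError:
        return None
    return label.lower()


def sniff_encoding(
    body: bytes,
    content_type: Optional[str] = None,
    default: str = "windows-1252",  # assumed fallback, not taken from the answer
) -> Tuple[str, str]:
    """Return (encoding, source) using the order BOM > HTTP header > meta tag."""
    # 1. Byte order mark.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name, "bom"

    # 2. Transport layer: charset parameter of the Content-Type header,
    #    only if the encoding is actually supported.
    if content_type:
        match = _HEADER_CHARSET.search(content_type)
        if match:
            enc = _supported(match.group(1))
            if enc:
                return enc, "http header"

    # 3. Meta declaration near the start of the document
    #    (the 1024-byte window is a simplifying assumption).
    match = _META_CHARSET.search(body[:1024])
    if match:
        enc = _supported(match.group(1).decode("ascii"))
        if enc:
            return enc, "meta"

    # 4. Nothing declared: fall back to the assumed default.
    return default, "default"
```

With this sketch, a page served as `text/html; charset=ISO-8859-1` whose markup says `<meta charset='utf-8'>` would be decoded as ISO-8859-1, because the header wins over the meta tag. Real browsers apply further heuristics that are not modelled here.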
