What are the special reserved character entities in HTML and in XML?
The information that I have says:
HTML:
&
First, you're comparing an HTML 4.01 specification with an HTML 5 one. HTML5 ties in more closely with XML than HTML 4.01 ever did (that's why we have XHTML), so this answer will stick to HTML 5 and XML.
Your quoted references are all consistent on the following points:
< should always be represented with &lt; when not indicating a processing instruction
> should always be represented with &gt; when not indicating a processing instruction
& should always be represented with &amp;
The exception is content inside <![CDATA[ ]]> (which only applies to XML)
I agree 100% with this. You never want the parser to mistake literals for instructions, so it's a solid idea to always encode these characters (the space is a separate matter; see below). Good parsers know that anything contained within <![CDATA[ ]]> is not an instruction, so the encoding is not necessary there.
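To make the escaping rules above concrete, here is a minimal sketch using only Python's standard library. The sample string `raw` is made up for illustration, and the choice of Python is mine, not something from the question.

```python
import html
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample text containing all three reserved characters.
raw = "if a < b && b > c"

# xml.sax.saxutils.escape replaces &, < and > with their entities.
xml_text = escape(raw)

# html.escape does the same; quote=False leaves ' and " untouched.
html_text = html.escape(raw, quote=False)

# Inside a CDATA section the parser treats the characters as plain text,
# so no encoding is needed there (XML only).
cdata_doc = "<code><![CDATA[" + raw + "]]></code>"
cdata_text = ET.fromstring(cdata_doc).text

# The escaped form parses back to the same text.
escaped_doc = "<code>" + xml_text + "</code>"
escaped_text = ET.fromstring(escaped_doc).text

print(xml_text)
print(cdata_text == escaped_text == raw)
```

Both routes hand the parser the same text; the CDATA form just skips the entity step.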
In practice, I never encode ' or " unless
<tag>"Yoinks!", he said.</tag>)
Both specifications also agree with this.
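As a rough illustration of the quote handling, the sketch below parses the element from the example with its quotes left unencoded, then shows the one case I am sure requires encoding: a quote inside an attribute value delimited by that same quote character. The attribute name `title` and the sample `value` are hypothetical, and Python's standard library is my own choice here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Literal quotes in element text parse fine without any encoding.
elem = ET.fromstring('<tag>"Yoinks!", he said.</tag>')
text = elem.text

# Hypothetical attribute value containing both kinds of quote and an ampersand.
value = 'say "hi" & \'bye\''

# quoteattr escapes the value and wraps it in a suitable pair of quotes.
attr = quoteattr(value)

# Round trip: the parser gives back the original value.
doc = "<tag title=" + attr + "/>"
parsed = ET.fromstring(doc).get("title")

print(text)
print(attr)
print(parsed == value)
```

Letting a helper such as `quoteattr` pick the delimiter avoids having to reason about which quote character needs the entity.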
So, the only point of contention is the space character. The only mention of it in either specification is in the context of serialization. Outside of that, you should always use a literal space. Unless you are writing your own parser, I don't see the need to be doing any kind of serialization, so this is beside the point.