I'm consuming an RSS feed and the document contains a special character, &raquo;
I'm guessing the feed is not encoded properly but I can't change that. I'd like to o
+1 what Frédéric said. You can also serve » as a raw unescaped character, presumably encoded in UTF-8.
If it's someone else's RSS feed, you need to kick them to stop producing malformed XML; no XML parser will read this.
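To make the difference concrete, here is a small sketch using Python's standard-library parser (my choice for illustration; the question doesn't say which parser is in use). The sample title elements are made up. It checks the named entity, the numeric reference, and the raw UTF-8 character in turn:

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical sample elements, not taken from the real feed.
named = "<title>Next &raquo;</title>"                      # HTML named entity
numeric = "<title>Next &#187;</title>"                     # numeric character reference
raw = "<title>Next \u00bb</title>".encode("utf-8")         # raw character as UTF-8 bytes

print(parses(named))    # False: the entity is undefined in XML
print(parses(numeric))  # True
print(parses(raw))      # True
```

Only the first form is rejected, which is why the fix belongs on the producing side.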
In a <description> element, the HTML content should normally be XML-escaped. So if the description of the item is This is a <em>really</em> interesting article, it should appear in the XML as:
<description>This is a &lt;em&gt;really&lt;/em&gt; interesting article</description>
Consequently, an HTML-encoded &raquo; character should have come out as
&amp;raquo;
If it was included directly from an HTML source without being escaped, that's a more serious XML-injection problem.
(This is assuming RSS 2.0. In the various earlier versions of RSS, whether the <description> contained HTML or plain text varied from spec to spec and was sometimes completely unspecified. For old RSS versions it's not really reliable to use HTML content at all.)
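For whoever generates the feed, the escaping step described above might look roughly like this. It is a sketch in Python using xml.sax.saxutils.escape; the sample HTML fragment is invented, and a real feed generator would normally let its XML library do the escaping rather than concatenating strings:

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical HTML content for an item, including an HTML named entity.
html_fragment = "This is a <em>really</em> interesting article &raquo;"

# escape() turns & into &amp;, < into &lt; and > into &gt;
item_xml = "<description>" + escape(html_fragment) + "</description>"
print(item_xml)

# An XML parser hands back the original HTML string, ready for an HTML renderer.
recovered = ET.fromstring(item_xml).text
print(recovered == html_fragment)
```

The consumer then gets the HTML back as text, and it is the HTML layer, not the XML parser, that has to understand the named entity.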
&raquo; is an HTML named entity and is not supported in XML. Out of the box, XML only supports &amp;, &apos;, &quot;, &gt; and &lt;.
Use the corresponding numeric character reference &#187; (or hexadecimal &#xBB;) instead.
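If you really can't get the feed fixed, one possible workaround on the consuming side is to rewrite HTML named entities as numeric references before handing the text to the parser. The following is only a sketch in Python: the function name html_entities_to_numeric and the sample feed text are made up, and it assumes the feed is already decoded to a string and contains no CDATA sections (a blind regex replace would alter text inside those).

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML defines itself; these must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

def html_entities_to_numeric(text):
    """Replace HTML named entities with numeric character references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML's own entities and unknown names untouched
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)

# Hypothetical feed fragment.
feed_text = "<title>Next &raquo; &amp; more</title>"
fixed = html_entities_to_numeric(feed_text)
print(fixed)
print(ET.fromstring(fixed).text)
```

This treats the symptom only; names that aren't in Python's HTML entity table are passed through unchanged and will still fail to parse.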