I'm consuming an RSS feed and the document contains a special character, &raquo;
I'm guessing the feed is not encoded properly but I can't change that. I'd like to o
+1 what Frédéric said. You can also serve » as a raw unescaped character, presumably encoded in UTF-8.
If it's someone else's RSS feed, you need to kick them to stop producing malformed XML; no XML parser will read this.
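If you really cannot get the feed fixed, here is a rough sketch of what "no XML parser will read this" looks like in practice, plus one possible consumer-side workaround. It assumes Python's standard library parser and a made-up feed snippet; replace_html_entities is a hypothetical helper, not part of any library. It rewrites HTML-only named entities as numeric character references, which XML does accept.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML itself predefines; these must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def replace_html_entities(text):
    """Rewrite HTML-only named entities (e.g. &raquo;) as numeric references.

    Hypothetical helper for illustration; unknown names are left untouched.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)

# Made-up feed fragment containing an entity that XML does not define.
broken = "<rss><channel><title>News &raquo; Latest</title></channel></rss>"

try:
    ET.fromstring(broken)
    parsed_ok = True
except ET.ParseError:
    # Undefined entity: the document is not well-formed XML.
    parsed_ok = False

repaired = replace_html_entities(broken)
title = ET.fromstring(repaired).find("channel/title").text
```

This is a blunt text-level patch: it also touches anything that merely looks like an entity inside a CDATA section, and it expects the feed to be decoded to text already. Getting the producer to fix the feed is still the right answer.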
In a <description> element, the HTML content should normally be XML-escaped. So if the description of the item is This is a <em>really</em> interesting article, it should appear in the XML as:
<description>This is a &lt;em&gt;really&lt;/em&gt; interesting article</description>
Consequently, an HTML-encoded &raquo; character should have come out as &amp;raquo; in the XML.
If it was included directly from an HTML source without being escaped, that's a more serious XML-injection problem.
(This is assuming RSS 2.0. In the various earlier versions of RSS, whether the <description> contained HTML or plain text varied from spec to spec and was sometimes completely unspecified. For old RSS versions it's not really reliable to use HTML content at all.)
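For the RSS 2.0 case, the producer-side escaping described above can be sketched roughly as follows. It assumes Python's standard library and a made-up HTML fragment; a real feed generator would normally let its XML serializer do the escaping instead of concatenating strings.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Made-up HTML content for an item's description.
html_fragment = "This is a <em>really</em> interesting article &raquo;"

# escape() replaces &, < and > so the HTML travels as plain text in the element.
escaped = escape(html_fragment)
description_xml = "<description>" + escaped + "</description>"

# A consumer's XML parser undoes exactly one level of escaping
# and hands back the original HTML string.
recovered = ET.fromstring(description_xml).text
```

Here escaped contains &lt;em&gt; and &amp;raquo;, so the XML parser never sees an undefined entity, and the consumer is left with ordinary HTML to hand to an HTML parser.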