loading xml document fails with special character »

Backend · Unresolved · 2 · 1674
无人及你 2021-01-23 00:34

I'm consuming an RSS feed and the document contains a special character »

I'm guessing the feed is not encoded properly but I can't change that. I'd like to o

2 answers
  •  春和景丽
    2021-01-23 01:11

    +1 what Frédéric said. You can also serve » as a raw unescaped character, presumably encoded in UTF-8.

    If it's someone else's RSS feed, you need to kick them to stop producing malformed XML; no XML parser will read this.
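
    As a rough illustration of why a conforming parser refuses such a feed, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The feed fragments and the try_parse helper are hypothetical; the actual feed and parser from the question are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragments. &raquo; is an HTML entity; XML itself only
# predefines &lt; &gt; &amp; &apos; and &quot;.
malformed = "<rss><channel><title>News &raquo; Today</title></channel></rss>"
# Same title with a numeric character reference, which XML does allow.
well_formed = "<rss><channel><title>News &#187; Today</title></channel></rss>"


def try_parse(xml_text):
    """Return the channel title, or None if the document is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser stops at the undefined entity instead of guessing.
        return None
    return root.findtext("channel/title")


print(try_parse(malformed))
print(try_parse(well_formed))
```

    The first call fails on the undefined entity, while the numeric reference in the second parses to the literal » character.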

    In a <description> element, the HTML content should normally be XML-escaped. So if the description of the item is This is a <em>really</em> interesting article, it should appear in the XML as:

    <description>This is a &lt;em&gt;really&lt;/em&gt; interesting article</description>
    

    Consequently, an HTML-encoded » character should have come out as

    &amp;raquo;
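
    The escaping rule can be sketched from the producer's side in Python. This is only an illustration under assumed names (html_description, item_xml); it uses xml.sax.saxutils.escape from the standard library, which replaces &, < and >.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical HTML content for an item description, including an
# HTML-encoded » character.
html_description = "This is a <em>really</em> interesting article &raquo;"

# XML-escape the HTML so it travels as plain character data.
escaped = escape(html_description)
item_xml = "<item><description>" + escaped + "</description></item>"

# A consumer parsing the XML gets the original HTML string back.
recovered = ET.fromstring(item_xml).findtext("description")

print(escaped)
print(recovered == html_description)
```

    After parsing, the consumer holds the HTML string again and can hand it to an HTML-aware step that understands &raquo;.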
    

    If it was included directly from an HTML source without being escaped, that's a more serious XML-injection problem.
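
    A small sketch of what that looks like in practice, again with made-up content: when the HTML is pasted in unescaped, its tags become part of the XML structure, so even a fragment that happens to parse no longer carries the description as text.

```python
import xml.etree.ElementTree as ET

# Hypothetical item where the HTML was inserted without XML-escaping.
raw_html = "This is a <em>really</em> interesting article"
unescaped_item = "<item><description>" + raw_html + "</description></item>"

description = ET.fromstring(unescaped_item).find("description")

# The <em> tag was parsed as an XML child element, not as text.
child_tags = [child.tag for child in description]
# Only the text before the first child element is left as the description text.
leading_text = description.text

print(child_tags)
print(repr(leading_text))
```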

    (This is assuming RSS 2.0. In the various earlier versions of RSS, whether <description> contained HTML or plain text varied from spec to spec and was sometimes completely unspecified. For old RSS versions it's not really reliable to use HTML content at all.)
