With regex, how do I match between an XML tag multiple times?

再見小時候 2020-12-21 11:41

First, before you say anything: I HAVE to do this because the RSS is malformed and I can't correct it on my end. I tried using both an RSS parser and an XML parser, but they fail.

5 answers
  •  无人及你
    2020-12-21 12:20

    The RSS you posted is well-formed XML, but it is not valid RSS (according to the W3C feed validator). Since it's well-formed, your best bet is still to use an XML parser rather than a regex. In fact, most RSS parsers should be OK too: RSS is notorious for validation issues (partly due to poor specifications early on), so any RSS parser worth using shouldn't have trouble with the kinds of validation problems the W3C validator is reporting.
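To illustrate why a parser is enough here, below is a minimal Python sketch that pulls every `<title>` out of repeated `<item>` elements using the standard library's `xml.etree.ElementTree`. The sample feed, the `SAMPLE_FEED` name, and the `item_titles` helper are assumptions made up for this example; they are not the feed from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the feed: well-formed XML, which is all the
# parser needs, whether or not it passes RSS validation.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def item_titles(xml_text):
    """Return the text of <title> for every <item> in the document."""
    root = ET.fromstring(xml_text)
    # iter("item") walks the whole tree, so every item is visited once.
    return [item.findtext("title", default="") for item in root.iter("item")]


titles = item_titles(SAMPLE_FEED)
print(titles)
```

Because `root.iter("item")` visits each matching element in turn, the "match multiple times" part of the question comes for free, with no need to tune a greedy or non-greedy pattern.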

    As an aside, that looks like a Google News feed. You can get valid Atom by changing the output parameter from "rss" to "atom", e.g.:

    http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=h&num=3&output=atom
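If the feed URL is built in code, the parameter swap can be sketched with Python's `urllib.parse`. The `with_output_format` helper and the `rss_url` value are assumptions for illustration (the RSS URL is inferred from the Atom URL above by reversing the described change), and nothing here fetches the feed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_output_format(url, fmt="atom"):
    """Return url with its 'output' query parameter set to fmt."""
    parts = urlsplit(url)
    # Keep every existing parameter except 'output', preserving order.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "output"]
    query.append(("output", fmt))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumed RSS form of the feed URL, for illustration only.
rss_url = ("http://news.google.com/news"
           "?pz=1&cf=all&ned=us&hl=en&topic=h&num=3&output=rss")
atom_url = with_output_format(rss_url)
print(atom_url)
```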
    

    Google's feed-generating services generally do a better job of producing Atom than RSS. That said, you may also want to report the invalid RSS to Google.
