PHP DOMDocument : How to parse custom XML/RSS tag names with COLONS?

安稳与你 submitted on 2021-01-01 07:10:25

Question


I have an RSS feed to parse that looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</author>
            <description>Some description about apples</description>
            <xCal:description>This is the full description about apples</xCal:description>
        </item>
        <item>
            <title>About Oranges</title>
            <author>Marry L. Jones</author>
            <description>Some description about oranges</description>
            <xCal:description>This is the full description about oranges</xCal:description>
        </item>
    </channel>
</rss>

In PHP, I parse it like this:

$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );

foreach( $rss->getElementsByTagName("item") as $node ) {
    echo $node->getElementsByTagName("title")->item(0)->nodeValue;
    echo $node->getElementsByTagName("author")->item(0)->nodeValue;
    echo $node->getElementsByTagName("description")->item(0)->nodeValue;
    echo $node->getElementsByTagName("xCal:description")->item(0)->nodeValue;
}

I can read everything except the xCal:description node there. (The node names are exactly like that: description and the xCal:description.)

  1. How can I parse (read) nodes like xCal:description?
  2. Is it because of the similar node names, i.e. description and xCal:description?

(I can't change the RSS source since it's not under my control.)

Please kindly help.


Answer 1:


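Use getElementsByTagNameNS(). The xCal: part of the tag name is a namespace prefix, so look the element up by the namespace URI the feed declares for that prefix plus the local name: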

$node->getElementsByTagNameNS("urn:ietf:params:xml:ns:xcal", "description")->item(0)->nodeValue
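A minimal, self-contained sketch of that call, assuming a trimmed copy of the feed is loaded from an inline string instead of the URL in the question. The `XCAL_NS` constant and the `$xml` and `$full` variables are names made up for this example. Because the lookup matches on namespace URI plus local name, the un-namespaced `description` element is not returned by this call.

```php
<?php
// Hypothetical constant: the URI the feed binds to the xCal prefix.
const XCAL_NS = 'urn:ietf:params:xml:ns:xcal';

// Trimmed copy of the feed from the question, inlined for the example.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <description>Some description about apples</description>
            <xCal:description>This is the full description about apples</xCal:description>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);

$full = [];
foreach ($rss->getElementsByTagName('item') as $node) {
    // Match by namespace URI + local name, not by the "xCal:" prefix string.
    $matches = $node->getElementsByTagNameNS(XCAL_NS, 'description');
    // Guard against items that do not carry the element at all.
    $full[] = $matches->length > 0 ? $matches->item(0)->nodeValue : '';
}

echo implode("\n", $full), "\n";
```

Checking `length` before calling `item(0)` avoids reading `nodeValue` on `null` when an item has no `xCal:description`.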



Answer 2:


While using the namespace-aware variants of the DOM methods is a correct answer, you might want to take a look at XPath. It is a much more comfortable way to fetch data from a DOM.

For the XPath expressions, you can register your own prefixes for the namespaces as needed. The prefix does not have to match the one used in the document; only the namespace URI has to match.

$rss = new DOMDocument();
$rss->load("http://www.example.com/books.rss");
$xpath = new DOMXPath($rss);
$xpath->registerNamespace('xc', 'urn:ietf:params:xml:ns:xcal');

foreach($xpath->evaluate("//item") as $item) {
    echo $xpath->evaluate('string(title)', $item), "\n";
    echo $xpath->evaluate('string(author)', $item), "\n";
    echo $xpath->evaluate('string(description)', $item), "\n";
    echo $xpath->evaluate('string(xc:description)', $item), "\n";
}

Output:

About Apples
David K. Lowie
Some description about apples
This is the full description about apples
About Oranges
Marry L. Jones
Some description about oranges
This is the full description about oranges
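One practical difference from the `item(0)->nodeValue` style is how missing elements behave. The sketch below assumes a cut-down inline feed in which the second item has no `xCal:description`; the `$xml` and `$rows` variables are made up for this example. The XPath `string()` function converts an empty node set to an empty string, so no null check is needed.

```php
<?php
// Cut-down inline feed (hypothetical): the second item lacks xCal:description.
$xml = <<<'XML'
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <xCal:description>This is the full description about apples</xCal:description>
        </item>
        <item>
            <title>About Oranges</title>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);

$xpath = new DOMXPath($rss);
// "xc" is our own prefix; only the URI has to match the document's.
$xpath->registerNamespace('xc', 'urn:ietf:params:xml:ns:xcal');

$rows = [];
foreach ($xpath->evaluate('//item') as $item) {
    $rows[] = [
        'title' => $xpath->evaluate('string(title)', $item),
        // Yields '' when the item has no element in the xCal namespace.
        'full'  => $xpath->evaluate('string(xc:description)', $item),
    ];
}

foreach ($rows as $row) {
    echo $row['title'], ' | ', $row['full'], "\n";
}
```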


Source: https://stackoverflow.com/questions/38095199/php-domdocument-how-to-parse-custom-xml-rss-tag-names-with-colons
