Why does XPath select nodes outside of context node?

若如初见. Submitted on 2021-01-20 07:37:08

Question


I'm using XPath with Node.js and I have the following HTML document, where I want to select all article nodes and then in a second step all divs with class "abc":

<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Test</title>
</head>
<body>
    <article>
        <div>123456</div>
        <div class="abc">Hello0!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello1!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello2!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello3!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello4!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello5!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello6!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello7!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello8!</div>
    </article>
    <article>
        <div>123456</div>
        <div class="abc">Hello9!</div>
    </article>
</body>
</html>

I used the following code to select the nodes:

var xpath = require('xpath');
var DOMParser = require('xmldom').DOMParser;

let parser: DOMParser = new DOMParser();
let doc = parser.parseFromString("HTML-document","text/xml");
let nodes: Node[] = xpath.select("//article", doc);
console.log("NODES: ", nodes.length);
let divs: Node[] = xpath.select("//div[@class='abc']", nodes[0]);
console.log("DIVS: ", divs.length);

When I check the two console logs, the first one says "NODES: 10", so I have ten article nodes, as expected.

However, when I run the second select on the first of the ten article nodes, the console says "DIVS: 10". XPath selected all 10 divs even though I passed a single article as the context, where I expected just one div.

What am I doing wrong?


Answer 1:


Note that a path starting with // is absolute: it searches the whole document starting from the root, regardless of which context node you pass in. A path starting with .// searches only the descendants of the current (context) node. So, to search within the article element you have already found, replace

"//div[@class='abc']"

with

".//div[@class='abc']"

or

"./div[@class='abc']"

since div is a direct child of article.
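
The difference between the three expressions can be checked with a small script. This is only a minimal sketch: it assumes the same `xpath` and `xmldom` npm packages as the question, and it uses a shortened, well-formed three-article document (made up for this illustration) instead of the original page.

```typescript
// Assumes the `xpath` and `xmldom` npm packages used in the question.
const xpath = require('xpath');
const { DOMParser } = require('xmldom');

// Shortened stand-in for the question's document (illustrative data only).
const xml =
    '<body>' +
    '<article><div>123456</div><div class="abc">Hello0!</div></article>' +
    '<article><div>123456</div><div class="abc">Hello1!</div></article>' +
    '<article><div>123456</div><div class="abc">Hello2!</div></article>' +
    '</body>';

const doc = new DOMParser().parseFromString(xml, 'text/xml');
const articles = xpath.select('//article', doc);
const first = articles[0];

// Absolute path: starts at the document root and ignores the context node.
const absolute = xpath.select("//div[@class='abc']", first);

// Relative path: all matching descendants of the context node.
const descendants = xpath.select(".//div[@class='abc']", first);

// Relative path: only matching direct children of the context node.
const children = xpath.select("./div[@class='abc']", first);

console.log('absolute:', absolute.length);       // 3
console.log('descendants:', descendants.length); // 1
console.log('children:', children.length);       // 1
```

With the absolute path, the result covers every article in the document; with either relative path, only the div inside the first article is returned.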




Answer 2:


Andersson has already provided the correct direct answer to your question (+1), but here is another option: you can combine your two XPaths into one. This XPath,

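//article[1]/div[@class='abc']

will select the same div element as your two-step process is meant to. Note that XPath positions are 1-based, so [1] is the first article; [0] would match nothing.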

You can even be more elaborate at any step in the path. This XPath will select the div elements with @class='abc' within article elements with a div child whose string value is 123456:

//article[div='123456']/div[@class='abc']

For the particular XML document shown, the predicate on article selects all articles, but this possibility for refinement exists in general.
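
Both single-expression variants can be tried against the document directly, without a second select call. The following is a rough sketch under the same assumptions as the question (the `xpath` and `xmldom` npm packages). The three-article sample is invented for illustration, and its last article deliberately contains a different first div so that the predicate actually filters something.

```typescript
// Assumes the `xpath` and `xmldom` npm packages used in the question.
const xpath = require('xpath');
const { DOMParser } = require('xmldom');

// Illustrative data: the third article's first div is NOT '123456'.
const xml =
    '<body>' +
    '<article><div>123456</div><div class="abc">Hello0!</div></article>' +
    '<article><div>123456</div><div class="abc">Hello1!</div></article>' +
    '<article><div>999</div><div class="abc">Hello2!</div></article>' +
    '</body>';

const doc = new DOMParser().parseFromString(xml, 'text/xml');

// Positional predicate: XPath positions start at 1, so [1] is the first article.
const firstOnly = xpath.select("//article[1]/div[@class='abc']", doc);

// string(...) returns the string value of the first node in the node-set.
const firstText = xpath.select("string(//article[1]/div[@class='abc'])", doc);

// Content predicate: only articles that have a div child equal to '123456'.
const filtered = xpath.select("//article[div='123456']/div[@class='abc']", doc);

console.log(firstOnly.length); // 1
console.log(firstText);        // Hello0!
console.log(filtered.length);  // 2
```

Because the expressions are evaluated against the document rather than a previously selected article, the absolute // prefix is the right choice here.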



Source: https://stackoverflow.com/questions/42399404/why-does-xpath-select-nodes-outside-of-context-node
