HtmlAgilityPack selecting childNodes not as expected

Posted by 谁说我不能喝 on 2019-12-03 10:29:45

Question


I am using the HtmlAgilityPack library to parse some links in a page, but the methods are not returning the results I expect. In the code below I have an HtmlNodeCollection of links. For each link I want to check whether it contains an image node and then parse its attributes, but the SelectNodes and SelectSingleNode methods of linkNode seem to search the whole document rather than the child nodes of linkNode. What gives?

HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(content);
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");

foreach(HtmlNode linkNode in linkNodes)
{
    string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
    if (linkTitle == string.Empty)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");     
    }
}

Is there any other way to get the alt attribute of the image child node of linkNode, if it exists?


Answer 1:


You should remove the leading forward slash from "/img[@alt]". A path that begins with "/" is absolute, so the search starts at the root of the document rather than at linkNode.

HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");



Answer 2:


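In an XPath query you can also use "." to indicate that the search should start at the current node. Note that ".//img[@alt]" matches img elements at any depth below linkNode, not only its direct children.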

HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
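
To make the difference between the two relative forms concrete, here is a minimal sketch. The markup and the class name XPathScopeDemo are made up for illustration; only the HtmlAgilityPack calls already used in this thread are assumed. By XPath semantics, "img[@alt]" should find nothing for the second link, because its image is wrapped in a span, while ".//img[@alt]" should still find the nested image.

```csharp
using System;
using HtmlAgilityPack;

class XPathScopeDemo
{
    static void Main()
    {
        // Hypothetical markup: the second link wraps its image in a <span>.
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<a href='/one'><img alt='direct' src='1.png'></a>" +
            "<a href='/two'><span><img alt='nested' src='2.png'></span></a>");

        foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
        {
            // "img[@alt]" considers only direct children of the link.
            HtmlNode child = link.SelectSingleNode("img[@alt]");

            // ".//img[@alt]" considers every descendant of the link.
            HtmlNode descendant = link.SelectSingleNode(".//img[@alt]");

            Console.WriteLine("{0}: child={1}, descendant={2}",
                link.GetAttributeValue("href", string.Empty),
                child == null ? "(none)" : child.GetAttributeValue("alt", string.Empty),
                descendant == null ? "(none)" : descendant.GetAttributeValue("alt", string.Empty));
        }
    }
}
```

Which form to prefer depends on the markup: use the child form when the image must sit directly inside the link, and the descendant form when wrappers may intervene.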



Answer 3:


Also, remember the null check: SelectNodes returns null rather than an empty collection when nothing matches.

HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");

// SelectNodes returns null when there are no matches, so guard the loop.
if (linkNodes != null)
{
    foreach (HtmlNode linkNode in linkNodes)
    {
        string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
        if (linkTitle == string.Empty)
        {
            // Relative path: searches the children of linkNode, not the document root.
            HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
        }
    }
}
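
Putting the answers together, one possible way to actually read the alt text is sketched below. The helper name GetLinkLabels and the class LinkText are hypothetical, and falling back to the image's alt text when the title is empty is an assumption about what the asker wants. Since SelectSingleNode also returns null when nothing matches, the image node is checked before it is used.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class LinkText
{
    // Hypothetical helper: for each link, returns its title, or the alt text
    // of an image inside it when the title is missing.
    public static List<string> GetLinkLabels(string content)
    {
        var labels = new List<string>();
        var htmldoc = new HtmlDocument();
        htmldoc.LoadHtml(content);

        HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return labels; // no links in the document
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            string label = linkNode.GetAttributeValue("title", string.Empty);
            if (label == string.Empty)
            {
                // Search relative to the link; null means no image with an alt attribute.
                HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
                if (imageNode != null)
                {
                    label = imageNode.GetAttributeValue("alt", string.Empty);
                }
            }
            labels.Add(label);
        }
        return labels;
    }
}
```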


Source: https://stackoverflow.com/questions/857198/htmlagilitypack-selecting-childnodes-not-as-expected
