Get the value of an attribute with HtmlAgilityPack

佛祖请我去吃肉 2020-12-17 16:28

I want to get the value of an attribute using HtmlAgilityPack. HTML code:





        
5 Answers
  • 2020-12-17 17:05

    Get an HtmlNode by attribute value:

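    using System;
    using HtmlAgilityPack;

    public static class Extensions
    {
        // Depth-first search: returns the first node (this node or a descendant)
        // whose attribute matches the given value, ignoring case; null if none does.
        public static HtmlNode GetNodeByAttributeValue(this HtmlNode htmlNode, string attributeName, string attributeValue)
        {
            if (htmlNode.Attributes.Contains(attributeName) &&
                string.Equals(htmlNode.Attributes[attributeName].Value, attributeValue, StringComparison.OrdinalIgnoreCase))
            {
                return htmlNode;
            }

            // Recurse into the children and stop at the first match.
            foreach (var childHtmlNode in htmlNode.ChildNodes)
            {
                var resultNode = GetNodeByAttributeValue(childHtmlNode, attributeName, attributeValue);
                if (resultNode != null) return resultNode;
            }

            return null;
        }
    }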
    

    Usage

    var searchResultsDiv = pageDocument.DocumentNode.GetNodeByAttributeValue("someattributename", "resultsofsearch");
    
  • 2020-12-17 17:08

    OK, I ended up with this:

    var link = htmldoc.DocumentNode.SelectSingleNode("//link[@itemprop='thumbnailUrl']");
    var href = link.Attributes["href"].Value;
    
  • 2020-12-17 17:23

    Load the web page as an HtmlDocument and select the last link tag directly:

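            // Requires: using System.Linq; using HtmlAgilityPack;
            HtmlWeb web = new HtmlWeb();
            HtmlDocument doc = web.Load(Url);
            // SelectNodes returns null when nothing matches, hence the null-conditional operators.
            var output = doc.DocumentNode.SelectNodes("//link[@href]")?.LastOrDefault();
            var data = output?.Attributes["href"].Value;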
    

    Or load the web page as an HtmlDocument, get the collection of all matching link tags, loop over it, and read the attribute of the last one:

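            HtmlWeb web = new HtmlWeb();
            HtmlDocument doc = web.Load(Url);
            int count = 0;
            string data = "";
            var output = doc.DocumentNode.SelectNodes("//link[@href]");

            // SelectNodes returns null when nothing matches, so guard before looping.
            if (output != null)
            {
                foreach (var item in output)
                {
                    count++;
                    // Only the last node in the collection is of interest.
                    if (count == output.Count)
                    {
                        data = item.Attributes["href"].Value;
                        break;
                    }
                }
            }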
    
  • 2020-12-17 17:26

    The following XPath selects link elements that have an href attribute defined; you then take the last of those links:

    var link = doc.DocumentNode.SelectNodes("//link[@href]").LastOrDefault();
    // you can also check if link is not null
    var href = link.Attributes["href"].Value; // "anotherstyle7.css"
    

    You can also use the XPath last() function:

    var link = doc.DocumentNode.SelectSingleNode("//link[@href][last()]");
    var href = link.Attributes["href"].Value;
    

    UPDATE: If you want the last element that has both itemprop and href attributes, use the XPath //link[@href and @itemprop][last()], or //link[@href and @itemprop] if you go with the first approach.
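
    As a rough sketch of the UPDATE above, the helper below returns the href of the last link that carries both attributes. The class name LinkAttributeSketch, the method name GetLastItempropHref, and loading from an HTML string with LoadHtml are assumptions made for illustration. It also wraps the expression in parentheses, which is not what the UPDATE shows: in XPath, //link[...][last()] applies last() among the links of each parent element, while (//link[...])[last()] applies it to the whole result set. The two agree when all the links sit under the same parent.

    ```csharp
    using HtmlAgilityPack;

    // Hypothetical helper, not part of HtmlAgilityPack.
    public static class LinkAttributeSketch
    {
        // Returns the href of the last <link> in the document that has both
        // href and itemprop attributes, or null when no such link exists.
        public static string GetLastItempropHref(string html)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // The parentheses make last() pick the final node of the whole match set.
            var link = doc.DocumentNode.SelectSingleNode("(//link[@href and @itemprop])[last()]");

            // SelectSingleNode returns null when nothing matches.
            return link == null ? null : link.Attributes["href"].Value;
        }
    }
    ```

    Returning null instead of throwing leaves it to the caller to decide what a page without a matching link should mean.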

  • 2020-12-17 17:27

    You need something like this:

    HtmlWeb web = new HtmlWeb();
    HtmlAgilityPack.HtmlDocument htmldoc = web.Load(Url);
    htmldoc.OptionFixNestedTags = true;
    var navigator = (HtmlNodeNavigator)htmldoc.CreateNavigator();
    // This XPath selects the href attribute itself rather than the link element.
    string xpath = "//link[@itemprop]/@href";
    string val = navigator.SelectSingleNode(xpath).Value;
    