I'm trying to get all the divs whose class contains a certain word:
<div class="hello mike">content1</div>
<div class="hello jeff">content2</div>
<div class="john">content3</div>
I need to get all the divs whose class contains the word "hello". Something like this:
resultContent.DocumentNode.SelectNodes("//div[@class='hello']");
How can I do this with Html Agility Pack?
I got it:
resultContent.DocumentNode.SelectNodes("//div[contains(@class, 'hello')]");
I'm sure that doesn't work because your divs have multiple classes. You can try this instead:
resultContent.DocumentNode.Descendants("div").Where(d => d.GetAttributeValue("class", "").Contains("hello"));
As I wrote here, as of v1.6.5, Html Agility Pack includes a .HasClass("class-name") extension method.
IEnumerable<HtmlNode> nodes =
htmlDoc.DocumentNode.Descendants(0)
.Where(n => n.HasClass("class-name"));
Since you specified that the class has to contain a certain word, the following ensures that the word is:
- at the start of the string and followed by a space
- or in the middle of the string and surrounded by spaces
- or at the end of the string and preceded by a space
- or the only class name in the class attribute
It does so by padding the value of the class attribute with a space on each side and checking whether it contains the specified word (hello), also padded with a space on each side. This avoids false positives like class="something-hello-something".
resultContent.DocumentNode.SelectNodes("//div[contains(concat(' ', @class, ' '), ' hello ')]");
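To see why the padding matters, here is a minimal sketch in plain C# (no Html Agility Pack needed) that runs three checks on sample class values: a plain substring test, the space-padded test that the XPath above performs, and a split on whitespace. The class ClassMatchDemo and the helper HasClassToken are hypothetical names made up for this illustration; they are not part of any library.

```csharp
using System;
using System.Linq;

public static class ClassMatchDemo
{
    // Hypothetical helper: true when `word` is one of the
    // whitespace-separated names in the class attribute value.
    public static bool HasClassToken(string classAttr, string word) =>
        (classAttr ?? "")
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Contains(word);

    public static void Main()
    {
        string[] samples =
        {
            "hello mike", "hello jeff", "john", "something-hello-something"
        };

        foreach (string s in samples)
        {
            // Same idea as contains(@class, 'hello'): any substring matches.
            bool substring = s.Contains("hello");
            // Same idea as contains(concat(' ', @class, ' '), ' hello ').
            bool padded = (" " + s + " ").Contains(" hello ");
            // Token comparison; also copes with tabs and newlines.
            bool token = HasClassToken(s, "hello");

            Console.WriteLine(
                $"{s}: substring={substring}, padded={padded}, token={token}");
        }
    }
}
```

Only the last sample separates the approaches: the substring test accepts "something-hello-something", while the padded and token tests reject it. The token version differs from the padded one only when class names are separated by tabs or newlines instead of spaces.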
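HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.Load(filePath);
// Note: @class='hello' matches only when the attribute is exactly "hello".
// SelectNodes returns null when nothing matches, so check before iterating in real code.
foreach (HtmlNode div in htmlDoc.DocumentNode.SelectNodes("//div[@class='hello']"))
{
    //code
}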
Source: https://stackoverflow.com/questions/36711680/c-sharp-html-agility-pack-get-elements-by-class-name