Parsing HTML: Reading Option Tag Content with HtmlAgilityPack

Submitted by *爱你&永不变心* on 2019-11-28 01:30:28

By default, Html Agility Pack treats the <OPTION> tag as "Empty", which means it does not expect a closing </OPTION>. The closing tag is therefore discarded, and the text that follows is not parsed as the option's content. You can change this behavior through the HtmlNode.ElementsFlags collection.

Here is code that should do what you want:

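// Requires: using System; using HtmlAgilityPack;
// Stop treating <option> as an empty element so its text is kept inside it.
HtmlNode.ElementsFlags.Remove("option");

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(yourHtml);

// SelectNodes returns null when nothing matches, so check before iterating.
HtmlNodeCollection options = doc.DocumentNode.SelectNodes("//select[@id='onoffaci']//option");
if (options != null)
{
    foreach (HtmlNode node in options)
    {
        // GetAttributeValue avoids a NullReferenceException if "value" is missing.
        Console.WriteLine("Value=" + node.GetAttributeValue("value", ""));
        Console.WriteLine("InnerText=" + node.InnerText);
        Console.WriteLine();
    }
}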

Your XPath expression:

//option

This is an absolute path: it traverses the whole tree starting from the document root, regardless of which node you call it on.

You need a relative XPath expression:

descendant::option

Or the shorthand

.//option

Note: this is the only case where starting a path with . (the shorthand for self::node()) is useful.

You should use:

selectNode.SelectNodes("option");

instead of:

selectNode.SelectNodes("//option");

Otherwise, your XPath expression starts from the root of the HTML document rather than from selectNode.
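
To compare the expressions discussed above, here is a minimal sketch that evaluates each one from the first <select> node of a small sample document. The class name OptionXPathDemo, the helper CountFromFirstSelect, and the sample HTML are hypothetical and used only for illustration. The sketch assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

public static class OptionXPathDemo
{
    // Hypothetical helper: counts the nodes an XPath expression matches
    // when evaluated from the first <select> element in the given HTML.
    public static int CountFromFirstSelect(string html, string xpath)
    {
        // Keep </option> closing tags so each option holds its own text.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode selectNode = doc.DocumentNode.SelectSingleNode("//select");
        if (selectNode == null)
        {
            return 0;
        }

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection matches = selectNode.SelectNodes(xpath);
        return matches == null ? 0 : matches.Count;
    }

    public static void Main()
    {
        const string html =
            "<select id='a'><option value='1'>One</option><option value='2'>Two</option></select>" +
            "<select id='b'><option value='3'>Three</option></select>";

        // Absolute path: searches the whole document, even though it is called on a node.
        Console.WriteLine("//option  -> " + CountFromFirstSelect(html, "//option"));
        // Relative path: only descendants of the first <select>.
        Console.WriteLine(".//option -> " + CountFromFirstSelect(html, ".//option"));
        // Relative path: only direct children of the first <select>.
        Console.WriteLine("option    -> " + CountFromFirstSelect(html, "option"));
    }
}
```

Note that "option" matches direct children only, so if the options may be wrapped in <optgroup> elements, ".//option" is the safer choice.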
