How to get all input elements in a form with HtmlAgilityPack without getting a null reference error

帅比萌擦擦* 提交于 2019-11-27 08:37:10

You can do the following:

HtmlNode.ElementsFlags.Remove("form");

HtmlDocument doc = new HtmlDocument();

doc.Load(@"D:\test.html");

HtmlNode secondForm = doc.GetElementbyId("form2");

foreach (HtmlNode node in secondForm.Elements("input"))
{
    HtmlAttribute valueAttribute = node.Attributes["value"];

    if (valueAttribute != null)
    {
        Console.WriteLine(valueAttribute.Value);
    }
}

By default HTML Agility Pack parses forms as empty node because they are allowed to overlap other HTML elements. The first line, (HtmlNode.ElementsFlags.Remove("form");) disables this behavior allowing you to get the input elements inside the second form.

Update: Example of form elements overlap:

<table>
<form>
<!-- Other elements -->
</table>
</form>

The element begins inside a table but is closed outside the table element. This is allowed in the HTML specification and HTML Agility Pack has to deal with it.

Just get them in array:

HtmlNodeCollection resultCollection = doc.DocumentNode.SelectNodes("//*[@type='text']");
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!