Question
I am using VS2010 and HtmlAgilityPack 1.4.6 (from the Net40 folder). The following is my HTML:
<html>
<body>
<div id="header">
<h2 id="hd1">
Patient Name
</h2>
</div>
</body>
</html>
I am using the following C# code to access "hd1". Please tell me the correct way to do it.
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
try
{
    string filePath = "E:\\file1.htm";
    htmlDoc.LoadHtml(filePath);
    if (htmlDoc.DocumentNode != null)
    {
        HtmlNodeCollection _hdPatient = htmlDoc.DocumentNode.SelectNodes("//h2[@id=hd1]");
        // htmlDoc.DocumentNode.SelectNodes("//h2[@id='hd1']");
        //_hdPatient.InnerHtml = "Patient SurName";
    }
}
catch (Exception ex)
{
    throw ex;
}
I have tried many permutations and combinations, but I always get null.
Please help.
Answer 1:
Your problem is the way you load data into HtmlDocument. To load data from a file, you should use the Load(fileName) method. But you are using the LoadHtml(htmlString) method, which treats "E:\\file1.htm" as the document content. When HtmlAgilityPack tries to find h2 tags in the string E:\\file1.htm, it finds nothing. Here is the correct way to load an HTML file:
string filePath = "E:\\file1.htm";
htmlDoc.Load(filePath); // use instead of LoadHtml
Also, @Simon Mourier correctly pointed out that you should use the SelectSingleNode method if you are trying to get a single node:
// Single HtmlNode (note the closing bracket in the XPath predicate)
var patient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
patient.InnerHtml = "Patient SurName";
Or, if you are working with a collection of nodes, process them in a loop:
// Collection of nodes (SelectNodes returns null when nothing matches)
var patients = htmlDoc.DocumentNode.SelectNodes("//div[@class='patient']");
if (patients != null)
{
    foreach (var patient in patients)
        patient.SetAttributeValue("style", "visibility: hidden");
}
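To see the difference between the two loading methods without needing the file on disk, here is a minimal sketch. It assumes the HtmlAgilityPack assembly is referenced; the class name LoadDemo and the inline markup string are only illustrative stand-ins for the contents of file1.htm. It also includes the unquoted XPath from the question, which as far as I can tell compares the id attribute against a child element named hd1 rather than against the text "hd1".

```csharp
using System;
using HtmlAgilityPack;

class LoadDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // LoadHtml parses its argument as markup, so a file path
        // becomes a plain text node and there is no h2 to find.
        var fromPath = new HtmlDocument();
        fromPath.LoadHtml("E:\\file1.htm");
        HtmlNode missing = fromPath.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
        Console.WriteLine("Path passed to LoadHtml, node found: " + (missing != null));

        // Passing real markup (a stand-in for the file contents) works.
        var fromMarkup = new HtmlDocument();
        fromMarkup.LoadHtml(
            "<html><body><div id=\"header\">" +
            "<h2 id=\"hd1\">Patient Name</h2>" +
            "</div></body></html>");

        // Unquoted hd1 is read as an element name, not a string literal.
        HtmlNode unquoted = fromMarkup.DocumentNode.SelectSingleNode("//h2[@id=hd1]");
        Console.WriteLine("Unquoted id, node found: " + (unquoted != null));

        // Quoted literal: this is the form that matches the heading.
        HtmlNode heading = fromMarkup.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
        if (heading != null)
        {
            heading.InnerHtml = "Patient SurName";
            Console.WriteLine(heading.OuterHtml);
        }
    }
}
```

With a real file, you would replace the second LoadHtml call with Load(filePath) and, if the change should be persisted, call Save on the document afterwards.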
Answer 2:
You were almost there:
HtmlNode _hdPatient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
_hdPatient.InnerHtml = "Patient SurName";
Source: https://stackoverflow.com/questions/17042835/selecting-node-does-not-work-using-htmlagilitypack