How to get the value from a specific cell C# Html-Agility-Pack

Submitted by 你。 on 2019-12-11 04:33:35

Question


How do I get a value from a specific location in the second table in the document? I need the value from the second row down and the third column over in the HTML document below. How do I do this?

<html>
<head>
<title>Tables</title>
</head>
<body>
<table border="1">
  <tr>
    <th>Room</th>
    <th>Location</th>
  </tr>
  <tr>
    <td>Paint</td>
    <td>A4</td>
  </tr>
  <tr>
    <td>Stock</td>
    <td>B3</td>
  </tr>
  <tr>
    <td>Assy</td>
    <td>N9</td>
  </tr>
</table>
<p></p>
<table border="1">
  <tr>
    <th>Product</th>
    <th>Mat'l</th>
    <th>Weight</th>
    <th>Size</th>
  </tr>
  <tr>
    <td>Cover</td>
    <td>Plastic</td>
    <td>4</td>
    <td>16</td>
  </tr>
  <tr>
    <td>Retainer</td>
    <td>Steel</td>
    <td>12</td>
    <td>8</td>
  </tr>
  <tr>
    <td>Pin</td>
    <td>Bronze</td>
    <td>18</td>
    <td>7</td>
  </tr>
</table>
<p></p>
<table border="1">
  <tr>
    <th>Process</th>
    <th>Location</th>
    <th>Number</th>
  </tr>
  <tr>
    <td>Trim</td>
    <td>S2</td>
    <td>8</td>
  </tr>
  <tr>
    <td>Finish</td>
    <td>D2</td>
    <td>3</td>
  </tr>
</table>
</body>
</html>

Thanks!

Also... Please help a newbie out!!! Please direct me to a resource that can help me understand the syntax of Html-Agility-Pack (HAP). I have the CHM file for HAP - I've tried to use it and I've tried to use VS's object browser for HAP, but it's too cryptic for me at this point.


Answer 1:


Html Agility Pack comes with an XPath evaluator that applies .NET XPath syntax to the parsed HTML nodes. Note that XPath expressions used with this library require element and attribute names to be lowercase, regardless of the casing in the original HTML source.
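To illustrate the lowercase rule, here is a minimal sketch. `LowercaseNamesDemo` and `FirstCellText` are hypothetical names made up for this example, not part of Html Agility Pack. The HTML is written with uppercase tags on purpose: the lowercase expression should find the cell, while the uppercase one is expected to find nothing, in which case `SelectSingleNode` returns null.

```csharp
using System;
using HtmlAgilityPack;

public static class LowercaseNamesDemo
{
    // Hypothetical helper: returns the text of the first node matching the
    // XPath expression, or null when nothing matches.
    public static string FirstCellText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node?.InnerText;
    }

    public static void Main()
    {
        // Source markup deliberately uses uppercase tag names.
        const string html = "<TABLE><TR><TD>Cover</TD></TR></TABLE>";

        // Lowercase names in the expression, as the note above recommends.
        Console.WriteLine(FirstCellText(html, "//table/tr/td") ?? "(no match)");

        // Uppercase names in the expression: expected not to match.
        Console.WriteLine(FirstCellText(html, "//TABLE/TR/TD") ?? "(no match)");
    }
}
```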

So in your case, you can get the cell for the 3rd column, 2nd row, 2nd table with an expression like this:

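using System;
using HtmlAgilityPack; // from the HtmlAgilityPack package

HtmlDocument doc = new HtmlDocument();
doc.Load(YourTestHtmlFilePath); // path to the HTML file shown in the question

// 2nd table, 2nd row, 3rd cell (XPath positions are 1-based)
HtmlNode node = doc.DocumentNode.SelectSingleNode("//table[2]/tr[2]/td[3]");
Console.WriteLine(node.InnerText); // will output "4"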

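//table means get any TABLE element recursively from the root. [2] means take the 2nd table under its parent element (here all three tables are children of BODY, so this is the 2nd table in the document). XPath positions start at 1, not 0.

/tr means get any TR element directly under this table. [2] means take the 2nd row. The header row counts as row 1.

/td means get any TD element directly under this row. [3] means take the 3rd cell.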
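The same lookup can be wrapped in a small helper that takes 1-based table, row, and column numbers. This is only a sketch under a few assumptions: `CellReader` and `GetCellText` are hypothetical names, not part of Html Agility Pack; the tables are siblings, as in the question's document; and the markup has no TBODY elements between TABLE and TR. Because `SelectSingleNode` returns null when nothing matches, the helper returns null rather than throwing.

```csharp
using System;
using HtmlAgilityPack;

public static class CellReader
{
    // Hypothetical helper: returns the trimmed text of the requested TD cell,
    // or null when the table, row, or cell does not exist.
    public static string GetCellText(HtmlDocument doc, int table, int row, int column)
    {
        // Build the same kind of expression as in the answer, e.g. //table[2]/tr[2]/td[3].
        string xpath = $"//table[{table}]/tr[{row}]/td[{column}]";
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Shortened version of the question's document, loaded from a string.
        doc.LoadHtml(@"<html><body>
<table><tr><th>Room</th><th>Location</th></tr>
<tr><td>Paint</td><td>A4</td></tr></table>
<table><tr><th>Product</th><th>Mat'l</th><th>Weight</th><th>Size</th></tr>
<tr><td>Cover</td><td>Plastic</td><td>4</td><td>16</td></tr></table>
</body></html>");

        // 2nd table, 2nd row, 3rd cell: the Weight of the Cover row.
        Console.WriteLine(GetCellText(doc, 2, 2, 3) ?? "(not found)");

        // A row that does not exist: the helper returns null.
        Console.WriteLine(GetCellText(doc, 2, 9, 3) ?? "(not found)");
    }
}
```

Returning null keeps the caller in charge of the missing-cell case, which tends to matter when scraping pages whose layout can change.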

You can find good XPATH tutorials here: XPath Tutorial



Source: https://stackoverflow.com/questions/16474659/how-to-get-the-value-from-a-specific-cell-c-sharp-html-agility-pack
