I want to retrieve data from an HTML document. I am scraping data from a website and am almost done, but I ran into a problem when I tried to retrieve data from a table. Here is the HTML code
I prefer using the dynamic type together with the DomElement property of HtmlElement. This requires .NET 4 or later, because that is when dynamic was introduced.
For tables, the main advantage is that you don't have to loop through everything. If you know the row and column you are looking for, you can read that cell directly by its row and column indexes.
The other big advantage is that you can use essentially the entire DOM and read more than just the contents of the table. Make sure you write property names with their JavaScript casing (rows, cells, innerText), not C#-style PascalCase, even though you are in C#.
// Assumption: webBrowser1 is your WebBrowser control and "myTable" is the table's id (both hypothetical).
// Any GetElement... method works; use a loop or square-bracket index if it returns an HtmlElementCollection.
HtmlElement myTableElement = webBrowser1.Document.GetElementById("myTable");
// DomElement exposes the underlying DOM object; dynamic resolves its members at run time.
dynamic myTable = myTableElement.DomElement;
for (int i = 0; i < myTable.rows.length; i++)
{
    for (int j = 0; j < myTable.rows[i].cells.length; j++)
    {
        string cellContents = myTable.rows[i].cells[j].innerText;
        // You are not limited to innerText; you have the whole DOM available.
        // Do something with cellContents.
    }
}
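
If you only need one cell, the direct row-and-column access mentioned earlier might look like the sketch below. TableCellReader and GetCellText are hypothetical names made up for illustration, and the code assumes the element passed in is a table hosted in a WinForms WebBrowser control. I have not tried it against every kind of page, so treat it as a starting point.

```csharp
using System;
using System.Windows.Forms;

public static class TableCellReader
{
    // Hypothetical helper: returns the text of a single cell, or null when
    // the row or column index falls outside the table.
    public static string GetCellText(HtmlElement tableElement, int row, int column)
    {
        if (tableElement == null)
        {
            throw new ArgumentNullException("tableElement");
        }

        // DomElement is the underlying DOM object; dynamic (C# 4+) binds
        // rows, cells and innerText at run time.
        dynamic table = tableElement.DomElement;

        int rowCount = table.rows.length;
        if (row < 0 || row >= rowCount)
        {
            return null;
        }

        dynamic cells = table.rows[row].cells;
        int cellCount = cells.length;
        if (column < 0 || column >= cellCount)
        {
            return null;
        }

        // innerText may itself be null for an empty cell.
        return (string)cells[column].innerText;
    }
}
```

You would call it as TableCellReader.GetCellText(myTableElement, 2, 1) to read the second cell of the third row, since both indexes are zero-based. The bounds checks are there because rows in scraped tables often have different numbers of cells.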