Parsing HTML Table in C#

深忆病人 2020-12-02 14:17

I have an HTML page that contains a table, and I want to parse that table in a C# Windows Forms application:

http://www.mufap.com.pk/payout-report.php?tab=01

3 answers
  •  佛祖请我去吃肉
    2020-12-02 14:19

    I'm late to this, but one way to do what you ask using plain vanilla C# code is the following:

    // Required namespaces:
    // using System.Collections.Generic;  // List<T>
    // using System.Net;                  // WebUtility
    // using System.Windows.Forms;        // HtmlDocument, HtmlElement

    /// <summary>
    /// Parses a table and returns a list containing all the data, with columns separated by tabs.
    /// e.g.: records = getTableData(doc, 0);
    /// </summary>
    /// <param name="doc">HtmlDocument to work with</param>
    /// <param name="number">table index (base 0)</param>
    /// <returns>list containing the table data</returns>
    public List<string> getTableData(HtmlDocument doc, int number)
    {
      HtmlElementCollection tables = doc.GetElementsByTagName("table");
      int idx = 0;
      List<string> data = new List<string>();

      // Walk the tables in document order until the requested index is reached
      foreach (HtmlElement tbl in tables)
      {
        if (idx++ == number)
        {
          data = getTableData(tbl);
          break;
        }
      }
      return data;
    }

    /// <summary>
    /// Parses a table and returns a list containing all the data, with columns separated by tabs.
    /// e.g.: records = getTableData(getElement(doc, "table", "id", "table1"));
    /// </summary>
    /// <param name="tbl">HtmlElement table to work with</param>
    /// <returns>list containing the table data</returns>
    public List<string> getTableData(HtmlElement tbl)
    {
      int nrec = 0;
      List<string> data = new List<string>();
      string rowBuff;

      HtmlElementCollection rows = tbl.GetElementsByTagName("tr");
      HtmlElementCollection cols;
      foreach (HtmlElement tr in rows)
      {
        cols = tr.GetElementsByTagName("td");
        nrec++;
        // Each record starts with its row number, followed by one tab-separated field per cell
        rowBuff = nrec.ToString();
        foreach (HtmlElement td in cols)
        {
          rowBuff += "\t" + WebUtility.HtmlDecode(td.InnerText);
        }
        data.Add(rowBuff);
      }

      return data;
    }
    

    The code above lets you extract data from a table in two ways: by the table's index within the page (useful for unnamed tables), or by passing the table's HtmlElement directly to the function (faster, but only useful for named tables). Note that I chose to return a list of strings, one per row, where each string starts with the row number and the column values follow, separated by tab characters. You can easily change the code to return the data in whatever other format you prefer.
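
As an illustration of changing the return format, here is a minimal sketch that turns the tab-separated records into arrays of fields. The `TableRows` class and `SplitRows` method are hypothetical names introduced for this example, not part of the answer above, and the sketch assumes that no cell text contains a tab character itself.

```csharp
using System;
using System.Collections.Generic;

public static class TableRows
{
    // Hypothetical helper: splits each tab-separated record, in the shape
    // produced by getTableData, into an array of its individual fields.
    public static List<string[]> SplitRows(IEnumerable<string> records)
    {
        var result = new List<string[]>();
        foreach (string record in records)
        {
            // Field 0 is the row number that getTableData prepends;
            // the remaining fields are the cell texts in column order.
            // Assumption: cell texts do not contain tab characters.
            result.Add(record.Split('\t'));
        }
        return result;
    }
}
```

With this in place, `TableRows.SplitRows(getTableData(doc, 0))` would give one `string[]` per table row, so a cell can be read as `rows[r][c]`, with `c` starting at 1 for the first column.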
