Get data only from an HTML table using preg_match_all in PHP

暖寄归人 2021-01-03 16:07

I have an HTML table like this:

string...
2 answers
  •  孤独总比滥情好
    2021-01-03 16:20

    You absolutely do NOT want to parse HTML with Regex.

    There are far too many variations, for one, and more importantly, regex isn't very good with the hierarchical nature of HTML. It's best to use an XML parser or, better yet, an HTML-specific parser.
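
    To make the "too many variations" point concrete, here is a minimal sketch. The pattern and the sample markup are made up for illustration; they are not taken from the question. All three cells are written in ways a browser accepts, yet a pattern that expects a literal `<td>` sees only one of them.

```php
<?php
// Three cells, each spelled differently: plain, uppercase with an attribute,
// and with a line break inside the opening tag.
$html = "<table><tr><td>a</td><TD class=\"x\">b</TD><td\n>c</td></tr></table>";

// Naive pattern (hypothetical): it only recognises the exact text "<td>...</td>".
preg_match_all('/<td>(.*?)<\/td>/', $html, $matches);

// $matches[1] holds the captured cell contents.
$cells = $matches[1];
var_dump($cells); // only the first cell, 'a', is captured
```

    The pattern could be patched for each of these cases, but every patch adds another special case, and nested tables remain out of reach.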

    Whenever I need to scrape HTML, I tend to use the Simple HTML DOM Parser library, which parses an HTML document into a traversable PHP object that you can query much like jQuery.

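    <?php
    // Assumes simple_html_dom.php (Simple HTML DOM Parser) is on the include path.
    require_once 'simple_html_dom.php';

    // Sample table (markup assumed; it matches the output below).
    $sHtml = <<<EOS
    <table>
      <tr><td>data0</td><td>data1</td><td>data2</td><td>data3</td><td>data4</td></tr>
      <tr><td>data00</td><td>data11</td><td>data22</td><td>data33</td><td>data44</td></tr>
      <tr><td>data000</td><td>data111</td><td>data222</td><td>data333</td><td>data444</td></tr>
    </table>
    EOS;

    // Parse the string into a queryable object and select every table row.
    $oHTML = str_get_html($sHtml);
    $oTRs = $oHTML->find('table tr');
    $aData = array();
    foreach ($oTRs as $oTR) {
        // Collect the trimmed text of each cell in this row.
        $aRow = array();
        $oTDs = $oTR->find('td');
        foreach ($oTDs as $oTD) {
            $aRow[] = trim($oTD->plaintext);
        }
        $aData[] = $aRow;
    }
    var_dump($aData);
    ?>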

And the output:

array
  0 => 
    array
      0 => string 'data0' (length=5)
      1 => string 'data1' (length=5)
      2 => string 'data2' (length=5)
      3 => string 'data3' (length=5)
      4 => string 'data4' (length=5)
  1 => 
    array
      0 => string 'data00' (length=6)
      1 => string 'data11' (length=6)
      2 => string 'data22' (length=6)
      3 => string 'data33' (length=6)
      4 => string 'data44' (length=6)
  2 => 
    array
      0 => string 'data000' (length=7)
      1 => string 'data111' (length=7)
      2 => string 'data222' (length=7)
      3 => string 'data333' (length=7)
      4 => string 'data444' (length=7)
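
    If adding a third-party library is not an option, the same extraction can be sketched with PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`). This is an alternative to the approach above, not part of it. The sample markup is an assumption chosen to match the output shown, and `extractTableRows()` is a hypothetical helper name.

```php
<?php
// Sample table: assumed markup, chosen to match the output shown above.
$html = <<<'EOS'
<table>
  <tr><td>data0</td><td>data1</td><td>data2</td><td>data3</td><td>data4</td></tr>
  <tr><td>data00</td><td>data11</td><td>data22</td><td>data33</td><td>data44</td></tr>
  <tr><td>data000</td><td>data111</td><td>data222</td><td>data333</td><td>data444</td></tr>
</table>
EOS;

/**
 * Return the trimmed text of every <td>, grouped by <tr>.
 * (extractTableRows is a hypothetical helper, not a library function.)
 */
function extractTableRows(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table//tr') as $tr) {
        $row = [];
        // "./td" selects only the cells that are direct children of this row.
        foreach ($xpath->query('./td', $tr) as $td) {
            $row[] = trim($td->textContent);
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = extractTableRows($html);
print_r($rows);
```

    Rows that contain only `<th>` cells come back as empty arrays with this query; widening it to `./td | ./th` would include header cells as well.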
