Question
I have used file_get_contents() to load the source code of a site into a single string variable.
The source contains many rows that look like this:
<td align="center"><a href="somewebsite.com/something">12345</a></td>
(and a lot of rows that don't look like that). I want to extract all the ID numbers (12345 above) and put them in an array. How can I do that? I assume I need some kind of regular expression together with the preg_match_all() function, but I'm not sure how...
Answer 1:
Try this:
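// preg_match_all() finds every match; preg_match() stops after the first one.
// The parentheses capture only the digits, which end up in $matches[1].
preg_match_all('/>([0-9]+)<\/a><\/td>/', $str, $matches);
$values = $matches[1];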
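For reference, here is a self-contained sketch of the preg_match_all() approach with a stricter pattern that ties the match to the whole `<td align="center"><a href="...">` row shown in the question. The sample HTML, the `$pattern` variable, and the `$ids` array are made up for illustration; a real page may use different attributes or whitespace, in which case the pattern would need adjusting.

```php
<?php
// Hypothetical sample input standing in for the string returned by file_get_contents().
$str = <<<HTML
<table>
<tr><td align="center"><a href="somewebsite.com/something">12345</a></td></tr>
<tr><td>not an id row</td></tr>
<tr><td align="center"><a href="somewebsite.com/other">67890</a></td></tr>
</table>
HTML;

// Capture group 1 holds only the digits between <a ...> and </a></td>.
$pattern = '/<td align="center"><a href="[^"]*">([0-9]+)<\/a><\/td>/';
preg_match_all($pattern, $str, $matches);

// $matches[0] holds the full matched text, $matches[1] the captured numbers.
$ids = $matches[1];

print_r($ids);
```

The captured values are strings; wrap the result in array_map('intval', $ids) if integers are needed.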
Answer 2:
Don't mess with regular expressions. Pass the string to a DOM library and let it do the mundane work for you. Take a look at: http://sourceforge.net/projects/simplehtmldom/
Then you can traverse your HTML like a tree and extract what you need. If you really want to get funky, read up on XPath.
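As a rough illustration of the tree-traversal idea, the sketch below uses PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than the Simple HTML DOM library linked above, so it is a swapped-in technique and not that library's API. The sample HTML and the `$ids` array are assumptions made for the example.

```php
<?php
// Hypothetical sample input standing in for the string returned by file_get_contents().
$html = <<<HTML
<html><body><table>
<tr><td align="center"><a href="somewebsite.com/something">12345</a></td></tr>
<tr><td>no link here</td></tr>
<tr><td align="center"><a href="somewebsite.com/other">67890</a></td></tr>
</table></body></html>
HTML;

// Real-world pages are rarely valid HTML, so buffer parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$ids = [];

// Select every <a> that is a direct child of a centered <td>.
foreach ($xpath->query('//td[@align="center"]/a') as $link) {
    $text = trim($link->textContent);
    // Keep only link texts that consist purely of digits.
    if (preg_match('/^[0-9]+$/', $text) === 1) {
        $ids[] = $text;
    }
}

print_r($ids);
```

Compared with a regular expression, the XPath query keeps working if attribute order or whitespace inside the tags changes, at the cost of parsing the whole document.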
Source: https://stackoverflow.com/questions/5735737/extract-data-php-string