Hopefully this should be a simple question for someone who has done it before!
I have a list of old web documents in table format with lots of contact details in it. Wha
This is exactly what I was looking for, and it worked perfectly.
I created a function to extract the table and save it as HTML:
function clean_web_source($web_source) {
    $dom = new DOMDocument();
    // @ suppresses the parser warnings that malformed HTML would otherwise raise.
    @$dom->loadHTML($web_source);
    $xpath = new DOMXPath($dom);
    // Select every table whose width attribute is exactly "580".
    $nodes = $xpath->query('//table[@width="580"]');
    $data = array();
    foreach ($nodes as $node) {
        // Copy each matching table into its own document so it can be serialized alone.
        $tmp_dom = new DOMDocument();
        $tmp_dom->appendChild($tmp_dom->importNode($node, true));
        // Before switching to saveHTML() I used textContent and print_r($data)
        // to find the array position that interested me.
        $data[] = trim($tmp_dom->saveHTML());
    }
    return $data[2]; // The table at position 2 is the one I want.
}
$url = "http://www.theurl.com/?param=1&lang=1";
$web_source = file_get_contents($url);
$target_source = clean_web_source($web_source); // The HTML I was looking for.
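For anyone reusing this, here is a rough variant of the same idea, offered as a sketch rather than a definitive version. It returns all matching tables instead of a hard-coded position, and it collects parser warnings with libxml_use_internal_errors() instead of the @ operator. The function name extract_tables, the $query parameter and the sample markup are my own assumptions for illustration; they are not part of the code above.

```php
<?php
// Hypothetical helper: returns the HTML of every table matching the XPath
// query, so the caller can pick a position after inspecting the result.
function extract_tables(string $web_source, string $query = '//table[@width="580"]'): array
{
    // Collect libxml parse warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($web_source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $tables = array();
    foreach ($xpath->query($query) as $node) {
        // saveHTML($node) serializes only this node, so no temporary document is needed.
        $tables[] = trim($dom->saveHTML($node));
    }
    return $tables;
}

// Made-up sample markup, for illustration only.
$sample = '<html><body>'
    . '<table width="580"><tr><td>first</td></tr></table>'
    . '<table width="100"><tr><td>skipped</td></tr></table>'
    . '<table width="580"><tr><td>second</td></tr></table>'
    . '</body></html>';

$tables = extract_tables($sample);
echo count($tables), "\n";
// Check that the position exists before using it.
echo isset($tables[1]) ? $tables[1] : 'not found', "\n";
```

Checking isset() before reading a fixed position avoids an undefined-offset notice when a page contains fewer matching tables than expected.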
Thanks.
I believe you are looking for something like this:
$nodes = $xpath->query('//table/tbody/tr/td[@align="top"] |
//table/tbody/tr/td[@valign="top"]');
$data = array();
foreach ($nodes as $node) {
    $data[] = $node->textContent;
}
This would give you:
Array
(
[0] => Indigo Blue 123
[1] => 123 Blue House
[2] =>
[3] =>
[4] => Hanley
[5] =>
[6] => ST13 4SN
[7] => Stoke on Trent
[8] => 01875 322511
[9] =>
[10] => www.indigoblue123.org.uk
)
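If the empty entries get in the way, they can be filtered out after trimming. The snippet below is only a sketch under a few assumptions: the markup is made up (it reuses two values from the output above), and it includes an explicit tbody because the query requires one and, as far as I know, DOMDocument does not add that element on its own the way browsers do.

```php
<?php
// Made-up markup for illustration; note the explicit <tbody>,
// which the XPath query below depends on.
$html = '<table><tbody>'
    . '<tr><td valign="top">Indigo Blue 123</td></tr>'
    . '<tr><td valign="top"> </td></tr>'
    . '<tr><td>not matched</td></tr>'
    . '<tr><td valign="top">ST13 4SN</td></tr>'
    . '</tbody></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$data = array();
foreach ($xpath->query('//table/tbody/tr/td[@valign="top"]') as $node) {
    // Trim so that whitespace-only cells become empty strings.
    $data[] = trim($node->textContent);
}

// Drop the empty strings, then reindex the array from 0.
$data = array_values(array_filter($data, function ($value) {
    return $value !== '';
}));

print_r($data);
```

If the positions in the array matter (for example, position 6 always being the postcode), skip the filtering step and keep the empty entries.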