Hello, I found a few and tried a few, but nothing really works for me. The best one I found was able to extract the title of the page, but there are many title tags on the page and it extrac
If it's HTML, there should only be one title tag... but, granted, it could be XML with an XSLT. In that case, instead of mucking about with regexps to try to parse it, it's generally better to create a DOMDocument object and use that instead:
Of course, if the document isn't well-formed XML, this is going to fall over.
// Taken directly from the comments on the PHP documentation at:
// http://uk3.php.net/manual/en/domdocument.load.php
// so that you can load an XML file over HTTP
$opts = array(
    'http' => array(
        'user_agent' => 'PHP libxml agent',
    )
);
$context = stream_context_create($opts);
libxml_set_streams_context($context);
// request a file through HTTP
// load() is an instance method, so create the object first
$xml = new DOMDocument();
$xml->load('http://www.example.com/file.xml');
// added this bit to get the elements
$aTitles = $xml->getElementsByTagName('title');
// loop and output
foreach ($aTitles as $oTitle) {
    echo $oTitle->nodeValue, "\n";
}
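If the page turns out to be ordinary HTML rather than well-formed XML, a rough sketch along the following lines may cope better, since DOMDocument::loadHTML() uses libxml's more forgiving HTML parser. The sample markup in $html is made up for illustration, and the variable names are my own:

```php
<?php
// Hypothetical sample markup: deliberately not well-formed XML
// (unclosed <p>, bare <br>), standing in for a real page.
$html = '<html><head><title>Page title</title></head>'
      . '<body><p>First paragraph<br>Second line</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html); // tolerant HTML parser, unlike load()/loadXML()

// Discard whatever warnings were collected while parsing.
libxml_clear_errors();

// Gather the text of every <title> element found.
$titles = array();
foreach ($doc->getElementsByTagName('title') as $node) {
    $titles[] = trim($node->nodeValue);
}

echo implode("\n", $titles), "\n";
```

For a remote page, DOMDocument::loadHTMLFile() takes a filename or URL in place of the string, and the stream context set up above should still apply.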