Ahh, my daily DOM practice. You should use DOM to parse HTML and regex to parse strings such as HTML attributes.
Note: I have some basic regexes that could surely be improved upon by some wizards :)
Note #2: It adds some overhead, but you could use something like curl to check whether the href really points to an image by sending a HEAD request and looking at the Content-Type. Without that, the extension check below should still work in 80-90% of cases.
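<?php
// Sample input (the href values are placeholders).
$content = '
<a href="http://example.com/page.html">This will be ignored.</a>
<a href="http://example.com/photo.jpg">this will not be ignored</a>
bah
';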
$dom = new DOMDocument();
$dom->loadHTML($content);
$anchors = $dom->getElementsByTagName('a');
$i = $anchors->length-1;
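// Matches absolute http and https URLs.
$protocol = '/^https?:\/\//';
// Captures the file name before a trailing image extension (case-insensitive).
$ext = '/([\w+]+)\.(?:gif|jpe?g|png)$/i';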
// The node list is live, so walk it backwards while replacing nodes.
while ( $i > -1 ) {
    $anchor = $anchors->item($i);
    if ( $anchor->hasAttribute('href') ) {
        $link = $anchor->getAttribute('href');
        // Replace only absolute links whose path ends in an image extension.
        if (
            preg_match( $protocol, $link ) &&
            preg_match( $ext, $link, $matches )
        ) {
            $image = $dom->createElement('img');
            // $matches[1] is the file name without its extension.
            $image->setAttribute('alt', $matches[1]);
            $image->setAttribute('src', $link);
            $anchor->parentNode->replaceChild( $image, $anchor );
        }
    }
    $i--;
}
echo $dom->saveHTML();
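
For the HEAD request idea in Note #2, here is a minimal sketch, assuming the curl extension is enabled. The helper names `isImageContentType` and `urlPointsToImage` are made up for illustration, and the 5-second timeout is an arbitrary choice, so treat this as a starting point rather than a finished check.

```php
<?php
// Hypothetical helper: true when a Content-Type header value names an image.
function isImageContentType(?string $contentType): bool
{
    if ($contentType === null || $contentType === '') {
        return false;
    }
    // Drop parameters such as "; charset=binary" and normalise the case.
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));
    return strncmp($mediaType, 'image/', 6) === 0;
}

// Hypothetical helper: sends a HEAD request and inspects the Content-Type.
function urlPointsToImage(string $url, int $timeoutSeconds = 5): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // HEAD request: headers only, no body
        CURLOPT_RETURNTRANSFER => true, // do not echo the response
        CURLOPT_FOLLOWLOCATION => true, // follow redirects to the final URL
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    if (curl_exec($ch) === false) {
        return false; // network error, timeout, bad host, ...
    }
    // Treat anything outside the 2xx range as "not an image".
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status < 200 || $status >= 300) {
        return false;
    }
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    return isImageContentType(is_string($type) ? $type : null);
}
```

You could add `urlPointsToImage($link)` as an extra condition before replacing the anchor, at the cost of one network round trip per link. Some servers do not answer HEAD requests properly, so keeping the extension check as a fallback is probably wise.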