How can I use PHP to strip all/any attributes from a tag, say a paragraph tag?
For example, a <p> tag carrying arbitrary attributes should be reduced to a plain <p>.
I honestly think that the only sane way to do this is to use a tag and attribute whitelist with the HTML Purifier library. Example script here:
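<?php
// Adjust this path to wherever HTML Purifier is installed.
require_once 'HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
// Whitelist of allowed tags; allowed attributes go in square brackets.
$config->set('HTML.Allowed', 'p,b,a[href],i,br,img[src]');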
$config->set('URI.Base', 'http://www.example.com');
$config->set('URI.MakeAbsolute', true);
$purifier = new HTMLPurifier($config);
$dirty_html = "
broken a href linky
c
Anzahl besuchter Seiten
missing end tag
ende
";
$clean_html = $purifier->purify($dirty_html);
print "<h1>dirty</h1>\n";
print "<pre>" . htmlentities($dirty_html) . "</pre>\n";
print "<h1>clean</h1>\n";
print "<pre>" . htmlentities($clean_html) . "</pre>\n";
?>
This yields the following clean, standards-conforming HTML fragment:
broken a href linkfnord
y
c
Anzahl besuchter Seiten
missing end tag
ende
In your case the whitelist would be:
$config->set('HTML.Allowed', 'p');
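
To make that last point concrete, here is a minimal sketch of the paragraph-only case. It assumes HTML Purifier was installed via Composer (package ezyang/htmlpurifier), so the autoload path is a guess about your setup. The helper name stripParagraphAttributes is made up for illustration and is not part of the library. Turning off the definition cache is optional and only there so the script does not need a writable cache directory.

```php
<?php
// Assumption: HTML Purifier was installed via Composer (ezyang/htmlpurifier).
// Otherwise, require HTMLPurifier.auto.php from your install path instead.
require_once __DIR__ . '/vendor/autoload.php';

// Hypothetical helper, not part of HTML Purifier itself.
function stripParagraphAttributes(string $html): string
{
    $config = HTMLPurifier_Config::createDefault();
    // Allow only <p>, with no attributes listed, so every attribute is dropped.
    $config->set('HTML.Allowed', 'p');
    // Disable the definition cache so no writable cache directory is needed.
    $config->set('Cache.DefinitionImpl', null);

    $purifier = new HTMLPurifier($config);
    return $purifier->purify($html);
}

$dirty = '<p class="intro" style="color: red" onclick="alert(1)">Hello</p>';
echo stripParagraphAttributes($dirty), "\n";
```

Keep in mind that with this whitelist every tag other than p is removed as well, so if you want to keep things like b or a[href], add them back to HTML.Allowed as in the longer example.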