Question
This is a problem I've had for a long time. I currently accept a full HTML page from the user as input and want to filter and clean it. The problem with HTMLPurifier is that it removes the html, head, and body tags, as well as the styles in the head. I've googled, looked at the forums, and tried implementing what was written there, with no luck. Can someone help?
What I want: to keep the HTML, HEAD, STYLE, and BODY tags.
What I have done:
$config->set('HTML.DefinitionID', 'test');
$config->set('HTML.DefinitionRev', 1);
$config->set('HTML.AllowedElements', array('html','head', 'body', 'style', 'div', 'p'));
if ($def = $config->maybeGetRawHTMLDefinition()) {
    $def->addElement('html', 'Block', 'Inline', 'Common', array());
    $def->addElement('head', 'Block', 'Inline', 'Common', array());
    $def->addElement('style', 'Block', 'Inline', 'Common', array());
    $def->addElement('body', 'Block', 'Inline', 'Common', array());
}
Answer 1:
Why not use strip_tags? It supports a list of allowed tags.
http://www.php.net/manual/en/function.strip-tags.php
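As a rough sketch of what that suggestion looks like in practice, the snippet below runs strip_tags with the tags from the question as the allow-list. The input string and the variable names ($dirty, $allowed, $clean) are made up for illustration. Note that strip_tags only drops tags that are not on the list: it leaves attributes on allowed tags alone and keeps the text inside removed tags, so it is not a replacement for a real sanitizer.

```php
<?php
// Hypothetical input: a full page with one disallowed tag and one inline handler.
$dirty = '<html><head><style>p{color:red}</style><script>alert(1)</script></head>'
       . '<body><p onclick="x()">Hi</p></body></html>';

// Tags to keep, written in the string form of the second argument.
$allowed = '<html><head><style><body><div><p>';

$clean = strip_tags($dirty, $allowed);

// The <script> tags are removed, but their inner text and the onclick
// attribute on <p> are still present in $clean.
echo $clean, PHP_EOL;
```

The html, head, style, and body tags survive, but so does the onclick handler, which is the main reason to be careful with this approach on untrusted input.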
Answer 2:
You need to set
$config->set('Core.ConvertDocumentToFragment', false);
For whatever reason, Core.ConvertDocumentToFragment defaults to true, even though the documentation states that "for most inputs, this processing is not necessary".
I was bitten by this too. All I got from the error collector was the cryptic message "Removed document metadata tags", which in turn is a translation from the internal message "Lexer: Extracted body".
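For context, a minimal configuration sketch of where that setting goes, with error collection turned on so messages like the one above become visible. The require path and the $dirtyHtml variable are assumptions for illustration, and this only shows how to toggle the directive and read the collector. Whether the html, head, and body tags then survive depends on the rest of the configuration and is not guaranteed by this snippet.

```php
<?php
// Assumption: HTML Purifier is installed and this path matches your setup.
require_once 'HTMLPurifier.auto.php';

// Hypothetical input; in the question this comes from the user.
$dirtyHtml = '<html><head><style>p{color:red}</style></head>'
           . '<body><p>Hi</p></body></html>';

$config = HTMLPurifier_Config::createDefault();
// Stop the lexer from extracting only the contents of <body>.
$config->set('Core.ConvertDocumentToFragment', false);
// Record what the purifier removed or changed.
$config->set('Core.CollectErrors', true);

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify($dirtyHtml);

// Print the collected messages to see why tags were dropped.
$errors = $purifier->context->get('ErrorCollector');
echo $errors->getHTMLFormatted($config);
echo $clean, PHP_EOL;
```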
Answer 3:
End result: HTMLPurifier does not natively support parsing full HTML documents. Either extend it or find a way to pass those tags through.
Source: https://stackoverflow.com/questions/24037016/allowing-full-html-to-be-parsed-in-htmlpurifier