Allowing full html to be parsed in HTMLPurifier

筅森魡賤 submitted on 2019-12-10 10:55:53

Question


This is a problem I've had for a long time. I currently accept a full HTML page from the user as input and want to filter and clean it. The problem with HTMLPurifier is that it removes the html, head, and body tags, as well as the styles in the head. I've googled, looked at the forums, and tried implementing what was suggested there, with no luck. Can someone help?

What I want: to keep the html, head, style, and body tags.

What I have done:

$config->set('HTML.DefinitionID', 'test');
$config->set('HTML.DefinitionRev', 1);
$config->set('HTML.AllowedElements', array('html', 'head', 'body', 'style', 'div', 'p'));

if ($def = $config->maybeGetRawHTMLDefinition()) {
    $def->addElement('html', 'Block', 'Inline', 'Common', array());
    $def->addElement('head', 'Block', 'Inline', 'Common', array());
    $def->addElement('style', 'Block', 'Inline', 'Common', array());
    $def->addElement('body', 'Block', 'Inline', 'Common', array());
}

Answer 1:


Why not use strip_tags? It supports a list of allowed tags.

http://www.php.net/manual/en/function.strip-tags.php
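A minimal sketch of what this suggestion might look like. The input string and the allowed-tag list are made up for illustration. Note that strip_tags only removes tags that are not in the list: it keeps the text inside removed tags and leaves attributes on allowed tags untouched, so on its own it is not a sanitizer in the way HTMLPurifier is.

```php
<?php
// Hypothetical user input: a full HTML document with some unwanted markup.
$input = '<html><head><style>p { color: red; }</style></head>'
       . '<body><p onclick="alert(1)">Hello <script>evil()</script><b>world</b></p></body></html>';

// Tags to keep; every other tag is stripped (its inner text is kept).
$allowed = '<html><head><style><body><div><p>';

$output = strip_tags($input, $allowed);

echo $output, PHP_EOL;
```

The html, head, style, and body tags survive, but so does the onclick attribute on the p tag, which is why this approach only fits input that needs tag filtering rather than real sanitizing.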




Answer 2:


You need to set:

$config->set('Core.ConvertDocumentToFragment', false);

For whatever reason, Core.ConvertDocumentToFragment defaults to true, even though the documentation states that "for most inputs, this processing is not necessary".

I was bitten by this too. All I got from the error collector was the cryptic message "Removed document metadata tags", which is in turn a translation of the internal message "Lexer: Extracted body".
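For context, here is a rough sketch of where that setting sits in a complete script. The include path and the $dirtyHtml sample are assumptions; adjust them to your installation. Turning off Core.ConvertDocumentToFragment only stops the lexer from extracting the body contents up front; whether html, head, style, and body actually survive still depends on the HTML definition in use.

```php
<?php
// Assumption: HTMLPurifier is installed and this autoloader path is correct.
require_once 'HTMLPurifier.auto.php';

// Hypothetical input: a full document rather than a fragment.
$dirtyHtml = '<html><head><style>p { color: red; }</style></head>'
           . '<body><p>Hello</p></body></html>';

$config = HTMLPurifier_Config::createDefault();

// Do not reduce a full document to the contents of its body tag.
$config->set('Core.ConvertDocumentToFragment', false);

// Optional: collect errors to see what the purifier removed.
// This feature is marked experimental in the HTMLPurifier docs.
$config->set('Core.CollectErrors', true);

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify($dirtyHtml);

echo $clean, PHP_EOL;

// The error collector is registered in the purifier's context
// when Core.CollectErrors is enabled.
$errors = $purifier->context->get('ErrorCollector');
echo $errors->getHTMLFormatted($config), PHP_EOL;
```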




Answer 3:


End result: HTMLPurifier does not natively support parsing a full HTML document. Either extend it or find a way to pass those tags through.



Source: https://stackoverflow.com/questions/24037016/allowing-full-html-to-be-parsed-in-htmlpurifier
