How can I escape all code within <code></code> tags to allow people to post code?

Posted by 一个人想着一个人 on 2019-12-13 15:29:31

Question


What I want to do is to allow users to post code if they need to, so it is viewable and it doesn't render. For example:

<span>
<div id="hkhsdfhu"></div>
</span>
<h1>Hello</h1>

Should be turned into:

&lt;span&gt;
&lt;div id="hkhsdfhu"&gt;&lt;/div&gt;
&lt;/span&gt;
&lt;h1&gt;Hello&lt;/h1&gt;

Only if it is wrapped in <code></code> tags. Right now I am using the following function to allow only certain HTML tags and escape any other tags:

function allowedHtml($str) {
$allowed_tags = array("b", "strong", "i", "em");
$sans_tags = str_replace(array("<", ">"), array("&lt;","&gt;"), $str);
$regex = sprintf("~&lt;(/)?(%s)&gt;~", implode("|",$allowed_tags));
$with_allowed = preg_replace($regex, "<\\1\\2>", $sans_tags);
return $with_allowed;
}

However, if a user wraps their code in <code></code> tags and it contains any of the allowed tags in my function above, those tags will render instead of being escaped. How can I make it so that anything in <code></code> tags gets escaped (or just has the < and > turned into &lt; and &gt;)? I know about htmlentities(), but I don't want to apply it to the whole post, only to the content inside <code></code> tags.
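
To make the problem concrete, here is a minimal sketch that runs the function above on a made-up post (the sample string is an assumption for illustration, not part of the original question). The <b> inside <code> comes back as live markup while <h1> stays escaped. The <code> tags themselves are escaped too, because code is not in $allowed_tags.

```php
<?php
function allowedHtml($str) {
    $allowed_tags = array("b", "strong", "i", "em");
    // Escape every angle bracket first.
    $sans_tags = str_replace(array("<", ">"), array("&lt;", "&gt;"), $str);
    // Then restore only the whitelisted tags.
    $regex = sprintf("~&lt;(/)?(%s)&gt;~", implode("|", $allowed_tags));
    return preg_replace($regex, "<\\1\\2>", $sans_tags);
}

// Hypothetical sample post, not from the original question.
$post = "<code><b>x</b><h1>y</h1></code>";
$result = allowedHtml($post);
echo $result, "\n";
// prints: &lt;code&gt;<b>x</b>&lt;h1&gt;y&lt;/h1&gt;&lt;/code&gt;
```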

Thanks in advance!


Answer 1:


Just use a single preg_replace() call with the e modifier to run htmlentities() on everything it finds within <code> tags.

EDITED

function allowedHtml($str) {
  $str = htmlentities($str, ENT_QUOTES, "UTF-8");
  $allowed_tags = array("b", "strong", "i", "em", "code");
  foreach ($allowed_tags as $tag) {
    $str = preg_replace("#&lt;" . $tag . "&gt;(.*?)&lt;/" . $tag . "&gt;#i", "<" . $tag . ">$1</" . $tag . ">", $str);
  }
  return $str;
}

$reply = allowedHtml($_POST['reply']);
$reply = preg_replace("#\<code\>(.+?)\</code\>#e", "'<code>'.htmlentities('$1', ENT_QUOTES, 'UTF-8').'</code>'", $reply);
$reply = str_replace("&amp;", "&", $reply);

Rewrote your allowedHtml() function and added a str_replace() at the end.

It's tested and should now work perfectly :)
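
The e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the preg_replace() call above will not run on current PHP. Below is a sketch of the same idea using preg_replace_callback(). It is not the original answer's code: the callback name reescapeCode and the sample input are assumptions, the s modifier is added so multi-line blocks match, and instead of re-running htmlentities() and then undoing &amp;, it only converts < and > inside <code>.

```php
<?php
function allowedHtml($str) {
    // Escape everything, then restore only the whitelisted tag pairs.
    $str = htmlentities($str, ENT_QUOTES, "UTF-8");
    $allowed_tags = array("b", "strong", "i", "em", "code");
    foreach ($allowed_tags as $tag) {
        $str = preg_replace(
            "#&lt;" . $tag . "&gt;(.*?)&lt;/" . $tag . "&gt;#is",
            "<" . $tag . ">$1</" . $tag . ">",
            $str
        );
    }
    return $str;
}

// Hypothetical helper: re-escape any tag that allowedHtml() restored
// inside a <code> block. Entities produced earlier are left untouched.
function reescapeCode($matches) {
    $inner = str_replace(array("<", ">"), array("&lt;", "&gt;"), $matches[1]);
    return "<code>" . $inner . "</code>";
}

// Hypothetical sample input standing in for $_POST['reply'].
$input = "<b>bold</b>\n<code>\n<b>x</b>\n<h1>y</h1>\n</code>";

$reply = allowedHtml($input);
$reply = preg_replace_callback("#<code>(.+?)</code>#is", "reescapeCode", $reply);
echo $reply, "\n";
```

With this input, the <b> outside <code> stays live, while both <b> and <h1> inside <code> end up escaped.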

UPDATED - NEW SOLUTION

function convertHtml($reply, $revert = false) {
  $specials = array("**", "*", "_", "-");
  $tags = array("b", "i", "u", "s");

  foreach ($tags as $key => $tag) {
    $open = "<" . $tag . ">";
    $close = "</" . $tag . ">";

    if ($revert == true) {
      $special = $specials[$key];
      $reply = preg_replace("#" . $open . "(.+?)" . $close . "#i", $special . "$1" . $special, $reply);
    }
    else {
      $special = str_replace("*", "\*", $specials[$key]);
      $reply = preg_replace("#" . $special . "(.+?)" . $special . "#i", $open . "$1" . $close, $reply);
    }
  }

  return $reply;
}

$reply = htmlentities($reply, ENT_QUOTES, "UTF-8");
$reply = convertHtml($reply);

$reply = preg_replace("#[^\S\r\n]{4}(.+?)(?!.+)#i", "<pre><code>$1</code></pre>", $reply);
$reply = preg_replace("#\</code\>\</pre\>(\s*)\<pre\>\<code\>#i", "$1", $reply);

$reply = nl2br($reply);
$reply = preg_replace("#\<pre\>\<code\>(.*?)\</code\>\</pre\>#se", "'<pre><code>'.convertHtml(str_replace('<br />', '', '$1'), true).'</code></pre>'", $reply);

After some discussion, the code above implements a different solution. It works much like the Stack Overflow HTML conversion: text wrapped in ** becomes bold, * becomes italic, _ becomes underlined and - becomes strikethrough. On top of that, all lines starting with 4 or more spaces are output as code.
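
The last preg_replace() in this solution also depends on the e modifier, which was removed in PHP 7.0. The sketch below keeps convertHtml() as written and replaces only that final step with preg_replace_callback() and a closure. The input string is a hypothetical intermediate state (roughly what $reply could look like after nl2br()), chosen for illustration.

```php
<?php
function convertHtml($reply, $revert = false) {
    $specials = array("**", "*", "_", "-");
    $tags = array("b", "i", "u", "s");

    foreach ($tags as $key => $tag) {
        $open = "<" . $tag . ">";
        $close = "</" . $tag . ">";

        if ($revert == true) {
            // Tag back to marker, e.g. <b>x</b> becomes **x**.
            $special = $specials[$key];
            $reply = preg_replace("#" . $open . "(.+?)" . $close . "#i", $special . "$1" . $special, $reply);
        } else {
            // Marker to tag; * must be escaped inside the pattern.
            $special = str_replace("*", "\*", $specials[$key]);
            $reply = preg_replace("#" . $special . "(.+?)" . $special . "#i", $open . "$1" . $close, $reply);
        }
    }

    return $reply;
}

// Forward conversion on a hypothetical sample.
$html = convertHtml("**bold** and *italic*");

// Hypothetical intermediate state of a code block after nl2br().
$reply = "<pre><code>a <b>x</b><br />\nb</code></pre>";

$reply = preg_replace_callback(
    "#<pre><code>(.*?)</code></pre>#s",
    function ($m) {
        // Drop the <br /> added by nl2br() and turn tags back into markers.
        $inner = convertHtml(str_replace("<br />", "", $m[1]), true);
        return "<pre><code>" . $inner . "</code></pre>";
    },
    $reply
);

echo $html, "\n", $reply, "\n";
```

Inside the code block, the <b> pair is reverted to ** markers, so the code is shown as the user typed it.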




Answer 2:


I think you would be better off working directly with the DOM rather than using regular expressions to parse out allowed tags. For example, to traverse the DOM and escape the content of <code> tags, you could do something along the lines of:

$doc = new DOMDocument();
$doc->loadHTML($postHtml);
$codeNode = $doc->getElementsByTagName('code')->item(0);
$escapedCode = htmlspecialchars($codeNode->nodeValue);



Answer 3:


Here is a way you can do it with preg_replace(). Just make sure you call this function before you call your allowedHtml function so the tags are already replaced.

<?php

$post = <<<EOD
I am a person writing a post
How can I write this code?

Example:

<code>
<span>
<div id="hkhsdfhu"></div>
</span>
<h1>Hello</h1>
</code>

Pls help me...
EOD;

$post = preg_replace('/<code>(.*?)<\/code>/ise',
                     "'<code>' . htmlspecialchars('$1') . '</code>'",
                      $post);

var_dump($post);

Result:

string(201) "I am a person writing a post
How can I write this code?

Example:

<code>
&lt;span&gt;
&lt;div id=\&quot;hkhsdfhu\&quot;&gt;&lt;/div&gt;
&lt;/span&gt;
&lt;h1&gt;Hello&lt;/h1&gt;
</code>

Pls help me..."
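
The stray backslashes in \&quot; above come from the e modifier, which escapes quotes in the backreference before the replacement is evaluated. That modifier was also removed in PHP 7.0. A sketch of the same replacement with preg_replace_callback(), which passes the match through unchanged, could look like this (the shortened sample post and the explicit ENT_QUOTES flag are assumptions for illustration):

```php
<?php
// Shortened, hypothetical version of the sample post above.
$post = "<code>\n<div id=\"hkhsdfhu\"></div>\n<h1>Hello</h1>\n</code>";

$post = preg_replace_callback(
    '/<code>(.*?)<\/code>/is',
    function ($m) {
        // $m[1] is the raw text between the tags, with no added slashes.
        return '<code>' . htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8') . '</code>';
    },
    $post
);

echo $post, "\n";
```

As in the original answer, this should run before allowedHtml() so the content of <code> is already escaped.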



Answer 4:


Here's one.

$str = preg_replace_callback('/(?<=<code>)(.*?)(?=<\/code>)/si','escape_code',$str);

function escape_code($matches) {

    $tags = array('b','strong','i','em');
    // declare the tags in this array

    $allowed = implode('|',$tags);
    $match = htmlentities($matches[0],ENT_NOQUOTES,'UTF-8');
    return preg_replace('~&lt;(/)?('.$allowed.')(\s*/)?&gt;~i','<$1$2$3>',$match);
}
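
A quick usage sketch of the function above on a made-up string (the input is an assumption for illustration). It shows one thing worth knowing: after escaping, the tags listed in $tags are restored, so <b> inside <code> stays live markup while <h1> is escaped. Returning $match directly, without the final preg_replace(), would keep everything inside <code> escaped.

```php
<?php
function escape_code($matches) {
    // Tags that are turned back into real markup after escaping.
    $tags = array('b', 'strong', 'i', 'em');

    $allowed = implode('|', $tags);
    $match = htmlentities($matches[0], ENT_NOQUOTES, 'UTF-8');
    return preg_replace('~&lt;(/)?(' . $allowed . ')(\s*/)?&gt;~i', '<$1$2$3>', $match);
}

// Hypothetical sample input.
$str = '<code><b>x</b><h1>y</h1></code>';

// The lookarounds match only the text between <code> and </code>.
$str = preg_replace_callback('/(?<=<code>)(.*?)(?=<\/code>)/si', 'escape_code', $str);

echo $str, "\n";
// prints: <code><b>x</b>&lt;h1&gt;y&lt;/h1&gt;</code>
```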


Source: https://stackoverflow.com/questions/9509447/how-can-i-escape-all-code-within-code-code-tags-to-allow-people-to-post-code
