Minifying final HTML output using regular expressions with CodeIgniter

忘掉有多难 2020-12-04 13:56

Google pages suggest that you minify HTML, that is, remove all unnecessary whitespace. CodeIgniter does have a feature for gzipping output, or it can be done via .htacce

3 answers
  •  失恋的感觉
    2020-12-04 14:14

    I implemented the answer from @ridgerunner in two projects, and ended up hitting some severe slowdowns (10-30 second request times) in staging for one of the projects. I found out that I had to set both pcre.recursion_limit and pcre.backtrack_limit quite low for it to even work, but even then it would give up after about 2 seconds of processing and return false.
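A minimal sketch of how this kind of silent give-up can be detected, assuming plain PHP with the PCRE extension. The helper name `pregReplaceOrFail` is hypothetical, and the pattern in the demonstration is just a known heavy backtracker, not the regex from the other answer. Note that `preg_replace` itself signals such a failure by returning `null`, with the reason available from `preg_last_error()`:

```php
<?php
// Hypothetical helper (not part of PHP or CodeIgniter): run preg_replace()
// and turn a silent PCRE failure into an exception.
function pregReplaceOrFail(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);

    // preg_replace() returns null when PCRE gives up, for example because
    // pcre.backtrack_limit or pcre.recursion_limit was exceeded.
    if ($result === null) {
        throw new \RuntimeException('PCRE error code ' . preg_last_error());
    }

    return $result;
}

// Demonstration: a deliberately tiny backtrack limit (JIT is switched off
// because the limit is not applied the same way when JIT is active).
$oldJit   = ini_set('pcre.jit', '0');
$oldLimit = ini_set('pcre.backtrack_limit', '100');

try {
    pregReplaceOrFail('/(?:\D+|<\d+>)*[!?]/', '', 'foobar foobar foobar');
} catch (\RuntimeException $e) {
    echo $e->getMessage(), "\n";
}

// Restore the previous settings so later code is unaffected.
if ($oldLimit !== false) {
    ini_set('pcre.backtrack_limit', $oldLimit);
}
if ($oldJit !== false) {
    ini_set('pcre.jit', $oldJit);
}
```

Wrapping the call this way at least makes the failure visible instead of sending an empty page, but it does not make the regex any faster.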

    Since then, I've replaced it with this solution (with an easier-to-grasp regex), which is inspired by the outputfilter.trimwhitespace function from Smarty 2. It does no backtracking or recursion, and works every time (instead of catastrophically failing once in a blue moon):

    function filterHtml($input) {
        // Remove HTML comments, but not SSI
        $input = preg_replace('/<!--[^#](.*?)-->/s', '', $input);
    
        // The content inside these tags will be spared:
        $doNotCompressTags = ['script', 'pre', 'textarea'];
        $matches = [];
    
        foreach ($doNotCompressTags as $tag) {
            $regex = "!<{$tag}[^>]*?>.*?</{$tag}>!is";
    
            // It is assumed that this placeholder could not appear organically in your
            // output. If it can, you may have an XSS problem.
            $placeholder = "@@<'-placeholder-$tag'>@@";
    
            // Replace all the tags (including their content) with a placeholder, and keep their contents for later.
            $input = preg_replace_callback(
                $regex,
                function ($match) use ($tag, &$matches, $placeholder) {
                    $matches[$tag][] = $match[0];
                    return $placeholder;
                },
                $input
            );
        }
    
        // Remove whitespace (spaces, newlines and tabs)
        $input = trim(preg_replace('/[ \n\t]+/m', ' ', $input));
    
        // Iterate the blocks we replaced with placeholders beforehand, and replace the placeholders
        // with the original content.
        foreach ($matches as $tag => $blocks) {
            $placeholder = "@@<'-placeholder-$tag'>@@";
            $placeholderLength = strlen($placeholder);
            $position = 0;
    
            foreach ($blocks as $block) {
                $position = strpos($input, $placeholder, $position);
                if ($position === false) {
                    throw new \RuntimeException("Found too few placeholders of type $tag in input string");
                }
                $input = substr_replace($input, $block, $position, $placeholderLength);
            }
        }
    
        return $input;
    }
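One way to apply such a filter to the final output is to capture it with output buffering first. The following is only a sketch under stated assumptions: `renderFiltered` and `$collapse` are hypothetical names, and the stand-in filter merely collapses whitespace so that the example runs on its own. In practice you would pass `'filterHtml'` as the filter. In CodeIgniter, a `display_override` hook is probably the natural place for this, but that wiring is not shown here.

```php
<?php
// Hypothetical helper: capture everything $render echoes, then pass the
// captured HTML through $filter (for example the filterHtml function above).
function renderFiltered(callable $render, callable $filter): string
{
    ob_start();
    try {
        $render();
        $html = ob_get_contents();
    } finally {
        // Always close the buffer, even if $render throws.
        ob_end_clean();
    }

    return $filter($html);
}

// Stand-in filter for this example only: collapse runs of whitespace.
$collapse = function (string $html): string {
    return trim(preg_replace('/[ \n\t]+/', ' ', $html));
};

echo renderFiltered(function () {
    echo "<p>\n    Hello,   world\n</p>\n";
}, $collapse);
// prints: <p> Hello, world </p>
```

Keeping the filter a plain callable means the capturing code does not need to know which minifier is used.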
    
