Strip out HTML and Special Characters

别跟我提以往 2020-12-07 12:39

I'd like to use a PHP function (or anything else) to remove all HTML code and special characters, so that I am left with alphanumeric output only.

$des = "He         


        
9 Answers
  • 2020-12-07 13:04

    Here's a function I've been using, put together from various threads around the net. It removes everything, including all tags, and leaves you with a clean phrase. Does anyone know how to modify this script to allow periods (.)? In other words, keep everything as is, but leave periods and other punctuation, such as ! or a comma, alone. Let me know.

    function stripAlpha( $item )
    {
        // Patterns that are removed outright before the final clean-up
        $search     = array(
             '@<script[^>]*?>.*?</script>@si'   // Strip out javascript
            ,'@<style[^>]*?>.*?</style>@si'     // Strip style tags properly
            ,'@<[\/\!]*?[^<>]*?>@si'            // Strip out HTML tags
            ,'@<![\s\S]*?--[ \t\n\r]*>@'        // Strip multi-line comments including CDATA
            ,'/\s{2,}/'                         // Strip runs of two or more whitespace characters
        );

        // Patterns for the final clean-up, paired with $replace below
        $pattern    = array(
             '#[^a-zA-Z ]#'                     // Non alpha characters
            ,'/\s+/'                            // Collapse whitespace into a single space
        );

        $replace    = array(
             ''
            ,' '
        );

        $item = preg_replace( $search, '', html_entity_decode( $item ) );
        $item = trim( preg_replace( $pattern, $replace, strip_tags( $item ) ) );
        return $item;
    }
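
One way to answer the question about keeping periods and other punctuation is to add those characters to the negated character class. The sketch below is a simplified variant, not the original function: the name `stripAlphaKeep` and the `$keep` parameter are hypothetical, and it relies on `strip_tags()` alone rather than the custom tag patterns.

```php
<?php
// Hypothetical variant of stripAlpha(): $keep lists extra characters to preserve.
function stripAlphaKeep(string $item, string $keep = '.,!'): string
{
    // Decode entities first so encoded markup such as &lt;b&gt; is stripped too.
    $item = strip_tags(html_entity_decode($item));

    // preg_quote() escapes the kept characters so they are safe inside the class.
    $class = '[^a-zA-Z\s' . preg_quote($keep, '#') . ']';
    $item = preg_replace('#' . $class . '#', '', $item);

    // Collapse runs of whitespace into a single space.
    return trim(preg_replace('/\s+/', ' ', $item));
}

echo stripAlphaKeep('<p>Hello, <b>world</b>! It works.</p>'), "\n"; // Hello, world! It works.
```

Passing a different `$keep` string (for example `'.'` only) narrows the set of punctuation that survives.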
    
  • 2020-12-07 13:07

    You can do it in a single line :) This is especially useful for GET or POST requests:

    // $_GET values are already URL-decoded by PHP, so urldecode() is not needed here
    $clear = preg_replace('/[^A-Za-z0-9\-]/', '', $_GET['id']);
    
  • 2020-12-07 13:12

    preg_replace('/[^a-zA-Z0-9\s]/', '', $string) removes only the special characters and keeps the whitespace between words.
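
To make the effect of `\s` in the character class concrete, here is a small comparison. The sample string is made up for illustration.

```php
<?php
$string = "Hello,\tWorld!  It's 2020.";

// Without \s in the class, whitespace is removed along with the punctuation.
$noSpaces = preg_replace('/[^a-zA-Z0-9]/', '', $string);

// With \s, spaces, tabs and newlines are kept exactly as they were.
$withSpaces = preg_replace('/[^a-zA-Z0-9\s]/', '', $string);

echo $noSpaces, "\n"; // HelloWorldIts2020
```

Note that `$withSpaces` still contains the tab and the double space, so a separate step is needed if runs of whitespace should be collapsed.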

  • 2020-12-07 13:13

    A regex replace is probably the better choice here:

    // Strip HTML Tags
    $clear = strip_tags($des);
    // Clean up things like &amp;
    $clear = html_entity_decode($clear);
    // Strip out any url-encoded stuff
    $clear = urldecode($clear);
    // Replace non-AlNum characters with space
    $clear = preg_replace('/[^A-Za-z0-9]/', ' ', $clear);
    // Replace Multiple spaces with single space
    $clear = preg_replace('/ +/', ' ', $clear);
    // Trim the string of leading/trailing space
    $clear = trim($clear);
    

    Or, in one go:

    $clear = trim(preg_replace('/ +/', ' ', preg_replace('/[^A-Za-z0-9 ]/', ' ', urldecode(html_entity_decode(strip_tags($des))))));
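
The same steps can be wrapped in a function and traced on a sample input. This is only a sketch: `cleanText` is a hypothetical name and the input string is invented for the example.

```php
<?php
// cleanText() is a hypothetical wrapper around the steps listed in this answer.
function cleanText(string $des): string
{
    $clear = strip_tags($des);                             // remove HTML tags
    $clear = html_entity_decode($clear);                   // &amp; becomes &
    $clear = urldecode($clear);                            // %21 becomes !
    $clear = preg_replace('/[^A-Za-z0-9]/', ' ', $clear);  // non-alphanumerics become spaces
    $clear = preg_replace('/ +/', ' ', $clear);            // collapse repeated spaces
    return trim($clear);                                   // drop leading/trailing space
}

$des = '<p>Tom &amp; Jerry%21  rock</p>';
echo cleanText($des), "\n"; // Tom Jerry rock
```

Because special characters are replaced with a space rather than removed, words separated only by punctuation stay separate.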
    
  • 2020-12-07 13:13

    Strip out tags, leave only alphanumeric characters and space:

    $clear = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags($des));
    

    Edit: all credit to DaveRandom for the perfect solution...

    $clear = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags(html_entity_decode($des)));
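
A short comparison shows why decoding entities before stripping tags matters when the input contains entity-encoded markup. The sample string is an assumption made for the example.

```php
<?php
$des = '&lt;b&gt;Hi&lt;/b&gt; there';

// Without decoding, the entity names survive as stray letters.
$plain = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags($des));

// Decoding first turns &lt;b&gt; into <b>, which strip_tags() then removes.
$decoded = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags(html_entity_decode($des)));

echo $plain, "\n";   // ltbgtHiltbgt there
echo $decoded, "\n"; // Hi there
```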
    
  • 2020-12-07 13:19

    All the other solutions assume that English is the only language in the world :)

    They also strip letters with diacritics, such as ç or à.

    If the goal is only to remove the HTML, the solution stated in the PHP documentation is simply:

    $clear = strip_tags($des);
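
If special characters should still be removed while letters with diacritics are kept, one option is a Unicode-aware character class. This is a sketch that assumes the input is valid UTF-8; the sample string is invented.

```php
<?php
// Assumes $des is valid UTF-8: with the /u modifier, preg_replace() returns null on invalid input.
$des = '<p>Ça va très bien, à 100%!</p>';

// \p{L} matches any Unicode letter and \p{N} any Unicode number.
$clear = preg_replace('/[^\p{L}\p{N}\s]/u', '', strip_tags($des));

echo $clear, "\n"; // Ça va très bien à 100
```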
    