Remove non-ASCII characters from string

遥遥无期 2020-11-28 03:39

I'm getting strange characters when pulling data from a website:

Â

How can I remove anything that isn't a non-extended ASCII character?

8 answers
  •  遥遥无期
    2020-11-28 04:21

    I also think that the best solution might be to use a regular expression.

    Here's my suggestion:

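    function convert_to_normal_text($text) {
    
        // Whitelist of characters to keep, written as the inside of a regex
        // character class. The hyphen is escaped so that "+-=" is not read
        // as a range from "+" to "=".
        $normal_characters = "a-zA-Z0-9\s`~!@#$%^&*()_+\-={}|:;<>?,.\/\"\'\\\[\]";
        // Remove every character that is not in the whitelist.
        $normal_text = preg_replace("/[^$normal_characters]/", '', $text);
    
        return $normal_text;
    }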
    

    Then you can use it like this:

    $before = 'Some "normal characters": Abc123!+, some ASCII characters: ABC+ŤĎ and some non-ASCII characters: Ąąśćł.';
    $after = convert_to_normal_text($before);
    echo $after;
    

    Displays:

    Some "normal characters": Abc123!+, some ASCII characters: ABC+ and some non-ASCII characters: .
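
    The whitelist above can also be expressed as a byte range instead of listing each character. The following is only a minimal sketch of that idea, not part of the answer's code: the function name strip_non_ascii is hypothetical, and it assumes you want to keep printable 7-bit ASCII (0x20 to 0x7E) plus tab, carriage return and line feed.

    ```php
    <?php
    // Hypothetical helper (an assumption, not from the answer above):
    // keep printable 7-bit ASCII (0x20-0x7E) plus tab, CR and LF.
    function strip_non_ascii(string $text): string {
        // Without the /u modifier PCRE matches byte by byte, so every byte
        // of a multi-byte UTF-8 character falls outside the allowed set
        // and is removed.
        return preg_replace('/[^\x20-\x7E\t\r\n]/', '', $text) ?? '';
    }

    echo strip_non_ascii("Abc123!+ ŤĎ Ąąśćł."), "\n";
    ```

    Because the pattern works on bytes, a stray "Â" such as the one in the question is dropped entirely, while plain ASCII text passes through unchanged.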
    
