Replacing accented characters php

鱼传尺愫 2020-11-22 16:03

I am trying to replace accented characters with their unaccented equivalents. Below is what I am currently doing.

    $string = "Éric Cantona";
    $strict = st         


        
19 answers
  •  面向向阳花
    2020-11-22 16:48

    Since PHP 5.4, the intl extension provides a class named Transliterator.

    I believe that's the best way to remove diacritics for two reasons:

    1. Transliterator is based on ICU, so you're using the tables of the ICU library. ICU is a great project, developed over the years to provide comprehensive tables and functionality. Any table you write yourself will never be as complete as the one from ICU.

    2. In Unicode, the same character can be represented in more than one way. For example, the character ñ can be stored as a single precomposed code point (multi-byte in UTF-8), or as the letter n followed by a combining tilde (also multi-byte). In addition, some characters in Unicode are homoglyphs: they look the same while having different code points. For both reasons it's important to normalize the string as well.
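
    To make the second point concrete, here is a minimal sketch of the two representations of ñ. It assumes the intl extension is loaded (for the Normalizer class) and writes the characters as raw UTF-8 byte escapes so that the bytes are unambiguous; the variable names are only illustrative.

```php
<?php
// Two encodings of the same visible character "ñ" (requires the intl extension).
$precomposed = "\xC3\xB1";   // U+00F1 LATIN SMALL LETTER N WITH TILDE
$decomposed  = "n\xCC\x83";  // "n" followed by U+0303 COMBINING TILDE

// The strings render identically but are different byte sequences.
var_dump($precomposed === $decomposed);  // bool(false)

// Normalizing both to the same form (here NFC) makes them comparable.
$a = Normalizer::normalize($precomposed, Normalizer::FORM_C);
$b = Normalizer::normalize($decomposed, Normalizer::FORM_C);
var_dump($a === $b);                     // bool(true)
```

    A plain lookup table keyed on the precomposed character would miss the decomposed variant, which is why normalizing first matters.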

    Here's some sample code, taken from an old answer of mine:

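    <?php
    // NFD splits letters from their combining marks, the marks are removed, then NFC recomposes.
    $transliterator = Transliterator::createFromRules(':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC;', Transliterator::FORWARD);
    $test = ['abcd', 'èe', '€', 'àòùìéëü', 'àòùìéëü', 'tiësto'];
    foreach ($test as $e) {
        $normalized = $transliterator->transliterate($e);
        echo $e . ' --> ' . $normalized . "\n";
    }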
    

    Result:

    abcd --> abcd
    èe --> ee
    € --> €
    àòùìéëü --> aouieeu
    àòùìéëü --> aouieeu
    tiësto --> tiesto
    

    The rule string passed as the first argument when creating the Transliterator performs both the removal of diacritics and the normalization of the string.
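
    Applied to the string from the question, this could be wrapped in a small helper. This is only a sketch: the function name remove_diacritics() is hypothetical, and the rule string is one common ICU formulation (decompose, strip nonspacing marks, recompose) rather than the only possible one. It assumes the intl extension is available.

```php
<?php
// Hypothetical helper; the name remove_diacritics() is an illustration only.
function remove_diacritics($text)
{
    // Decompose (NFD), drop nonspacing marks, then recompose (NFC).
    $rules = ':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC;';
    $transliterator = Transliterator::createFromRules($rules, Transliterator::FORWARD);
    if ($transliterator === null) {
        // createFromRules() returns null when the rules cannot be compiled.
        throw new RuntimeException('Could not create the Transliterator: ' . intl_get_error_message());
    }
    return $transliterator->transliterate($text);
}

echo remove_diacritics("Éric Cantona"), "\n";  // Eric Cantona
```

    Characters that carry no combining mark, such as €, pass through unchanged, which matches the result list above.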
