How to “remove diacritics” from UTF8 characters in PHP?

Submitted by 99封情书 on 2020-04-13 06:51:35

Question


I need to replicate the behavior of MySQL's utf8_general_ci collation in PHP. Strictly speaking, I need to detect which strings it would consider different and which it would consider the same. The case-insensitive part is easy. The problem is that utf8_general_ci treats characters with and without diacritics as equal: e = è = é, etc. To replicate that comparison, I need a way to replace è -> e, é -> e.

The method that comes to my mind is:

echo iconv("utf-8", "ascii//TRANSLIT", "é");

One problem is that iconv behaves differently depending on the current locale, and that's asking for trouble.

The other problem is that the input may also contain Cyrillic letters, which should neither be stripped nor trigger a PHP Notice:

echo iconv("utf-8", "ascii//TRANSLIT", "дом");

Is there a solution, or do I have to manually create a mapping from each character with a diacritic to its counterpart without one?


Answer 1:


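The intl extension's Transliterator will let you define far more in-depth transliteration rules. The full documentation on transliteration rules can be found on icu-project.org. The Latin-ASCII transform used here only rewrites Latin-script characters, so Cyrillic input passes through unchanged, as the output shows.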

$tests = [ "é", "дом" ];

$tl = Transliterator::create('Latin-ASCII;');
foreach($tests as $str) {
    var_dump(
        $tl->transliterate($str)
    );
}

Output:

string(1) "e"
string(6) "дом"



Answer 2:


Is the goal to prevent collisions with values already present in the table? And should accented letters be allowed to coexist with differently accented and unaccented versions of the same letter? Then change the collation of the PRIMARY (or UNIQUE) key that is causing the collisions.

Any ..._bin COLLATION will allow e and é to coexist (not collide during insertion) because it treats them as different.

Do you need ...general_ci for some other reason? If so, please state the reason. If not, ALTER TABLE to change the COLLATION. I see no need for PHP code.



Source: https://stackoverflow.com/questions/48588705/how-to-remove-diacritics-from-utf8-characters-in-php
