How to replace Unicode characters with ASCII

被撕碎了的回忆 2021-02-15 18:35

I have the following command to replace Unicode characters with ASCII ones.

sed -i 's/Ã/A/g'

The problem is that Ã isn't recognized.

4 answers
  •  耶瑟儿~
    2021-02-15 19:04

    There is also uconv, from ICU.

    Examples:

    • uconv -x "::NFD; [:Nonspacing Mark:] > ; ::NFC;": removes accents (decompose, drop the combining marks, recompose)
    • uconv -x "::Latin; ::Latin-ASCII;": transliterates to Latin script, then to ASCII
    • uconv -x "::Latin; ::Latin-ASCII; ([^\x00-\x7F]) > ;": the same transliteration, followed by removal of any remaining code points above 0x7F
    • ...

    For example, echo "À l'école ☠" | uconv -x "::Latin; ::Latin-ASCII; ([^\x00-\x7F]) > ;" gives: A l'ecole
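
If uconv is not installed, a byte-wise sed mapping may be enough for a small, known set of characters. The following is only a minimal sketch, assuming the input and the script itself are both UTF-8; the function name to_ascii is hypothetical and the character list is illustrative, not complete. Running sed under LC_ALL=C makes it compare raw bytes, which sidesteps locale problems of the kind that can leave a character such as Ã unrecognized, and tr then drops any remaining bytes above 0x7F, mirroring the last uconv rule above.

```shell
# Hypothetical helper: map a few accented letters to ASCII, then
# delete every remaining non-ASCII byte. Assumes UTF-8 input.
to_ascii() {
  # LC_ALL=C: treat the data as plain bytes, so each multi-byte
  # character in a pattern matches its exact byte sequence.
  LC_ALL=C sed \
    -e 's/À/A/g' \
    -e 's/Ã/A/g' \
    -e 's/é/e/g' \
    -e 's/è/e/g' |
  # Octal range \200-\377 covers bytes 0x80-0xFF.
  LC_ALL=C tr -d '\200-\377'
}

printf '%s\n' "À l'école ☠" | to_ascii
```

With this input the sketch should print A l'ecole followed by a space, since only the ☠ itself is deleted. Any accented letter missing from the sed list is removed entirely rather than transliterated, which is why uconv remains the more general option.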
