How can I change extended Latin characters to their unaccented ASCII equivalents?

萌比男神i 2021-01-18 09:50

I need a generic transliteration or substitution regex that maps extended Latin characters to similar-looking ASCII characters, and all other extended characters to ''.

5 answers
  •  萌比男神i
    2021-01-18 10:43

    Perhaps a CPAN module would help?

    Text::Unidecode looks promising, though it does not strip ‡, Ω, or ‰; instead it replaces them with ++, O, and %o. This may or may not be what you want.

    Text::Unaccent is another candidate, but it only covers removing the accents.
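
If the goal is specifically "strip accents, delete everything else", a rough sketch using only the core Unicode::Normalize module might look like the following. This is not how Text::Unidecode or Text::Unaccent work internally; it is a different technique (canonical decomposition followed by two substitutions), and the helper name to_ascii is made up for illustration.

```perl
use strict;
use warnings;
use utf8;                          # this source file contains literal UTF-8
use Unicode::Normalize qw(NFD);    # ships with core Perl

# Hypothetical helper, not part of any CPAN module.
sub to_ascii {
    my ($text) = @_;
    my $decomposed = NFD($text);         # "é" becomes "e" followed by U+0301
    $decomposed =~ s/\p{Mn}+//g;         # remove the combining marks
    $decomposed =~ s/[^\x00-\x7F]+//g;   # remove whatever is still non-ASCII
    return $decomposed;
}

print to_ascii("Ωmega: crème brûlée‰"), "\n";   # prints "mega: creme brulee"
```

Note that letters without a canonical decomposition, such as ø or ß, are deleted by this sketch rather than mapped to a look-alike, so a transliteration table is still needed if those should survive.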
