Emacs lisp: Translate characters to standard ASCII transcription

老子叫甜甜 提交于 2020-01-24 08:45:09

问题


I am trying to write a function, that translates a string containing unicode characters into some default ASCII transcription. Ideally I'd like e.g. Ångström to become Angstroem or, if that is not possible, Angstrom. Likewise α=χ should become a=x (c?) or similar.

Does Emacs have such built-in capabilities? I know I can get the names and similar of characters (get-char-code-property) but I know no built-in transcription table.

The purpose is to translate titles of entries into meaningfully readable filenames, avoiding problems with software that doesn't understand unicode.

My current strategy is to build a translation-table by hand, but this approach is fairly limited and requires a lot of maintenance.


回答1:


There is no built-in capability that i know of. I wrote a package unidecode specifically for your task. It uses the same approach as in Python's same-named library. To install just add MELPA repository to your repository list:

(add-to-list 'package-archives
  '("melpa" . "http://melpa.milkbox.net/packages/") t)

Then run M-x package-install RET unidecode. unidecode has 2 functions, unidecode-unidecode that turns Unicode into ASCII, and unidecode-sanitize that discards non-alphanumeric characters and transforms space into hyphen.

ELISP> (unidecode-unidecode "¡Hola!, Grüß Gott, Hyvää päivää, Tere õhtust, Bonġu Cześć!, Dobrý den, Здравствуйте!, Γειά σας, გამარჯობა")
"!Hola!, Gruss Gott, Hyvaa paivaa, Tere ohtust, Bongu Czesc!, Dobry den, Zdravstvuite!, Geia sas, lmsllmlllmckhmslmgll"
ELISP> (unidecode-sanitize "¡Hola!, Grüß Gott, Hyvää päivää, Tere õhtust, Bonġu Cześć!, Dobrý den, Здравствуйте!, Γειά σας, გამარჯობა")
"hola-gruss-gott-hyvaa-paivaa-tere-ohtust-bongu-czesc-dobry-den-zdravstvuite-geia-sas-lmsllmlllmckhmslmgll"


来源:https://stackoverflow.com/questions/17195972/emacs-lisp-translate-characters-to-standard-ascii-transcription

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!