How do I translate 8bit characters into 7bit characters? (i.e. Ü to U)

旧时难觅i 2020-12-05 10:29

I'm looking for pseudocode, or sample code, to convert higher-bit ASCII characters (like Ü, which is extended ASCII 154) into U (which is ASCII 85).

My initial gues

15 answers
  •  情歌与酒
    2020-12-05 11:06

    Indeed, as proposed by unexist, the "iconv" function exists to handle all these weird conversions for you. It is available in almost every programming language and has a special option that tries to replace characters missing from the target set with approximations.

    Simply use iconv to convert your UTF-8 input string to 7-bit ASCII.
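As a rough illustration of what "convert with approximations" means: iconv requests this mode with the `//TRANSLIT` suffix on the target encoding (for example `ASCII//TRANSLIT`). Python's standard library has no iconv binding, so the sketch below swaps in a different technique, Unicode NFKD decomposition from `unicodedata`, which gives a similar effect for accented letters. The function name `to_ascii_approx` is hypothetical, and the results are not guaranteed to match iconv's.

```python
import unicodedata


def to_ascii_approx(text: str) -> str:
    """Approximate text in 7-bit ASCII (a stand-in for iconv's //TRANSLIT)."""
    # NFKD splits an accented letter into base letter + combining mark,
    # e.g. 'Ü' becomes 'U' followed by U+0308 (combining diaeresis).
    decomposed = unicodedata.normalize("NFKD", text)
    # errors="ignore" drops every remaining non-ASCII code point: the
    # combining marks, but also letters that have no decomposition at all.
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


# 8-bit input must first be decoded with the code page it was written in.
# Byte 154 is 'Ü' in CP437 but a different letter in Windows-1252.
raw = bytes([154])
print(to_ascii_approx(raw.decode("cp437")))   # U
print(to_ascii_approx(raw.decode("cp1252")))  # s
```

Note that letters without a decomposition, such as 'ø', are silently dropped by this sketch, whereas a real transliteration table would typically substitute something for them.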

    Otherwise, you will keep hitting corner cases: 8-bit input that uses a different code page with a different set of characters (so your conversion table does not work at all), or one last accented character you forgot to map (you mapped all the grave and acute accents, but forgot the Czech caron or the Nordic '°'), etc.

    Of course, if you want to apply the solution to a small, specific problem (making file-system-friendly filenames for your music collection), then look-up arrays are the way to go: either an array that maps each code number above 128 to an approximation under 128, as proposed by JeeBee, or the source/target pairs proposed by vIceBerg, depending on which substitution functions are already available in your language of choice. Such a table is quickly hacked together and quickly checked for missing elements.
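The look-up approach could be sketched as follows, using source/target pairs with Python's built-in `str.translate`. The table contents and the names `APPROXIMATIONS`, `to_ascii_lookup` and `unmapped` are illustrative assumptions, not the tables from the other answers, and the table is deliberately incomplete.

```python
# Hypothetical source/target pairs; extend this for your own data.
APPROXIMATIONS = {
    "Ü": "U", "ü": "u", "Ö": "O", "ö": "o", "Ä": "A", "ä": "a",
    "é": "e", "è": "e", "ß": "ss", "ø": "o", "Å": "A",
}

# str.maketrans accepts a dict of single characters to replacement strings.
TABLE = str.maketrans(APPROXIMATIONS)


def to_ascii_lookup(text: str) -> str:
    """Replace every mapped character; unmapped characters pass through."""
    return text.translate(TABLE)


def unmapped(text: str) -> list:
    """List the non-ASCII characters the table does not cover yet."""
    return sorted({ch for ch in to_ascii_lookup(text) if ord(ch) > 127})


print(to_ascii_lookup("Über Søren"))  # Uber Soren
print(unmapped("Čaj für Ü"))          # ['Č']
```

Running `unmapped` over the real filenames is the "quick check for missing elements": here it shows that the Czech caron was forgotten.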
