I am reading some text files in a Java program and would like to replace some Unicode characters with ASCII approximations. These files will eventually be broken into sente
Here's a Python package that does a good job. It's based on a Perl module Text::Unidecode. I assume this could be ported to Java.
http://www.tablix.org/~avian/blog/archives/2009/01/unicode_transliteration_in_python/
http://pypi.python.org/pypi/Unidecode