I am reading some text files in a Java program and would like to replace some Unicode characters with ASCII approximations. These files will eventually be broken into sente
What I've done for similar substitutions is create a Map (usually a HashMap) with the Unicode characters as the keys and their substitutes as the values.
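Roughly, in Java; the exact form of the for loop depends on what sort of character container you pass as the parameter to the method that does this, e.g. String, CharSequence, etc. This version indexes with charAt(), which works for any CharSequence (including String), and assumes xlateMap is a Map<Character, Character>:
StringBuilder output = new StringBuilder();
for (int i = 0; i < inputString.length(); i++)
{
    char c = inputString.charAt( i );
    // get() returns null when c has no entry in the map
    Character replacement = xlateMap.get( c );
    output.append( replacement != null ? replacement : c );
}
return output.toString();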
Anything in the Map is replaced; anything not in the Map is copied to the output unchanged.
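For completeness, here is a minimal self-contained sketch of the same idea. The class name AsciiApproximator, the method name toAscii, and the handful of entries in the table are all made up for illustration; you would fill the map with whatever characters actually turn up in your files.

```java
import java.util.HashMap;
import java.util.Map;

public class AsciiApproximator {
    // Hypothetical sample table: Unicode character -> ASCII approximation.
    private static final Map<Character, Character> XLATE_MAP = new HashMap<>();
    static {
        XLATE_MAP.put('\u2018', '\''); // left single quotation mark
        XLATE_MAP.put('\u2019', '\''); // right single quotation mark
        XLATE_MAP.put('\u201C', '"');  // left double quotation mark
        XLATE_MAP.put('\u201D', '"');  // right double quotation mark
        XLATE_MAP.put('\u2013', '-');  // en dash
    }

    // Accepts any CharSequence, so String and StringBuilder both work.
    public static String toAscii(CharSequence input) {
        StringBuilder output = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            Character replacement = XLATE_MAP.get(c); // null if c is not mapped
            output.append(replacement != null ? replacement.charValue() : c);
        }
        return output.toString();
    }

    public static void main(String[] args) {
        // Mapped quotes and dash are replaced; the unmapped e-acute passes through.
        System.out.println(toAscii("\u201CIt\u2019s fine\u201D \u2013 caf\u00E9"));
    }
}
```

Two limits of this sketch are worth knowing. Because the keys are single char values, it only handles characters in the Basic Multilingual Plane, and each character can only be replaced by exactly one character. If you need something like an ellipsis character turning into three dots, a Map<Character, String> with the same loop should do it, since StringBuilder.append also accepts a String.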