I am reading some text files in a Java program and would like to replace some Unicode characters with ASCII approximations. These files will eventually be broken into sente
Each Unicode character is assigned a general category. There are two separate categories for quotes: Punctuation, Initial quote (Pi) and Punctuation, Final quote (Pf).
With these lists, you should be able to handle all quotes appropriately, if you would like to code the regex manually.
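If you go the regex route, a rough sketch could look like the one below. Java's regex engine understands the category shorthands \p{Pi} and \p{Pf}, so you do not have to list every character yourself. The class and method names are made up for illustration. The set of "single" quotes mapped to an apostrophe is my own assumption, so adjust it to your data. Note that low-9 quotes such as „ belong to a different category and are not touched here.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class QuoteRegexSketch {
    // Assumption: these single-style quotes should become an ASCII apostrophe.
    private static final Pattern SINGLE_QUOTES =
            Pattern.compile("[\u2018\u2019\u201B\u2039\u203A]");

    // \p{Pi} = Punctuation, Initial quote; \p{Pf} = Punctuation, Final quote.
    private static final Pattern ANY_QUOTE = Pattern.compile("[\\p{Pi}\\p{Pf}]");

    static String normalizeQuotes(String text) {
        // Handle the single quotes first, then map every remaining
        // initial/final quote to an ASCII double quote.
        String singlesDone = SINGLE_QUOTES.matcher(text).replaceAll("'");
        return ANY_QUOTE.matcher(singlesDone).replaceAll("\"");
    }
}
```

Calling QuoteRegexSketch.normalizeQuotes on a string with curly quotes should give you the same text with plain ' and " characters.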
Java's Character.getType returns the category of a character, for example FINAL_QUOTE_PUNCTUATION.
Now you can get the category of each punctuation character and replace it with an appropriate ASCII substitute.
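Here is a minimal sketch of that idea, assuming you only care about the two quote categories. The class name, method names, and the choice of which code points count as "single" quotes are my own assumptions, not anything prescribed by the JDK. It walks the string by code point so that characters outside the BMP are copied through intact.

```java
// Hypothetical helper, not part of any library.
class QuoteCategorySketch {
    static String toAsciiQuotes(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            int type = Character.getType(cp);
            if (type == Character.INITIAL_QUOTE_PUNCTUATION
                    || type == Character.FINAL_QUOTE_PUNCTUATION) {
                // The category alone does not say single vs. double,
                // so that decision is made separately.
                out.append(isSingleQuote(cp) ? '\'' : '"');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Assumption: treat these code points as single quotes.
    private static boolean isSingleQuote(int cp) {
        return cp == 0x2018 || cp == 0x2019 || cp == 0x201B
                || cp == 0x2039 || cp == 0x203A;
    }
}
```

One thing to keep in mind: the category tells you that a character is an opening or closing quote, but not whether it is single or double, which is why the sketch needs the extra check.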
You can handle the other punctuation categories the same way. 'Punctuation, Other' contains some characters, for example PRIME ′, which you may also want to replace with an apostrophe.
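As a rough illustration of extending this to other categories, the sketch below maps everything in the dash category to an ASCII hyphen and treats PRIME (U+2032) as a special case. The names are hypothetical, and collapsing every dash to '-' is an assumption that may be too aggressive for some texts. Since 'Punctuation, Other' also holds ordinary characters like '.' and '!', it is safer to pick individual code points from it than to replace the whole category.

```java
// Hypothetical helper, not part of any library.
class OtherPunctuationSketch {
    static String toAsciiPunctuation(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.getType(cp) == Character.DASH_PUNCTUATION) {
                // Assumption: every dash (en dash, em dash, ...) becomes '-'.
                out.append('-');
            } else if (cp == 0x2032) {
                // PRIME is in 'Punctuation, Other'; handled individually.
                out.append('\'');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```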