Question
How do I convert MS Word curly quotes and apostrophes to regular quote and apostrophe characters in Java? What are the Unicode code points for these characters?
“how are you doing?”
‘howdy’
I want to convert the text above to:
"how are you doing?"
'howdy'
Answer 1:
Going off Thomas's answer, the code is:
return text.replaceAll("[\\u2018\\u2019]", "'")
.replaceAll("[\\u201C\\u201D]", "\"");
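For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same idea. The class name SmartQuotes and the method name straighten are made up for illustration; the only change from the snippet above is that the two patterns are compiled once instead of on every call.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the answer.
final class SmartQuotes {
    // Curly single quotes: U+2018 (left) and U+2019 (right, also the apostrophe).
    private static final Pattern SINGLE = Pattern.compile("[\\u2018\\u2019]");
    // Curly double quotes: U+201C (left) and U+201D (right).
    private static final Pattern DOUBLE = Pattern.compile("[\\u201C\\u201D]");

    // Replaces the four curly quote characters with their straight ASCII forms.
    static String straighten(String text) {
        String result = SINGLE.matcher(text).replaceAll("'");
        return DOUBLE.matcher(result).replaceAll("\"");
    }

    public static void main(String[] args) {
        System.out.println(straighten("\u201Chow are you doing?\u201D"));
        System.out.println(straighten("\u2018howdy\u2019"));
    }
}
```

Running main should print the two example lines from the question with straight quotes. Text without curly quotes passes through unchanged.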
Answer 2:
Here's a very useful link for everyone dealing with Unicode: Unicode codepoint lookup/search tool.
Searching for "quotation mark" gives
‘ (U+2018) LEFT SINGLE QUOTATION MARK
’ (U+2019) RIGHT SINGLE QUOTATION MARK
“ (U+201C) LEFT DOUBLE QUOTATION MARK
” (U+201D) RIGHT DOUBLE QUOTATION MARK
There are several other quote-like symbols that you might consider replacing.
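If you would rather inspect your own text than use a lookup tool, a rough sketch like the following lists every non-ASCII code point in a string together with its Unicode name. The class and method names are hypothetical; Character.getName is part of the standard library (Java 7 and later) and returns null for unassigned code points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
final class CodePointInspector {
    // Returns one entry such as "U+2018 <name>" per non-ASCII code point, in order.
    static List<String> describeNonAscii(String text) {
        List<String> found = new ArrayList<>();
        text.codePoints()
            .filter(cp -> cp > 0x7F)
            .forEach(cp -> found.add(
                String.format("U+%04X %s", cp, Character.getName(cp))));
        return found;
    }

    public static void main(String[] args) {
        // Sample input built from the curly quotes in the question.
        for (String line : describeNonAscii("\u201Chowdy\u201D it\u2019s")) {
            System.out.println(line);
        }
    }
}
```

Running this over a sample pasted from Word is a quick way to see which other quote-like characters actually occur in your data before deciding what to replace.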
Answer 3:
Thanks to Nick van Esch's answer at "C# How to replace Microsoft's Smart Quotes with straight quotation marks?"
Here is the code, in C# ('\u2019' is ’ in MS Word). It's useful because it covers the other problematic Word characters too, such as dashes and the ellipsis.
if (buffer.IndexOf('\u2013') > -1) buffer = buffer.Replace('\u2013', '-');
if (buffer.IndexOf('\u2014') > -1) buffer = buffer.Replace('\u2014', '-');
if (buffer.IndexOf('\u2015') > -1) buffer = buffer.Replace('\u2015', '-');
if (buffer.IndexOf('\u2017') > -1) buffer = buffer.Replace('\u2017', '_');
if (buffer.IndexOf('\u2018') > -1) buffer = buffer.Replace('\u2018', '\'');
if (buffer.IndexOf('\u2019') > -1) buffer = buffer.Replace('\u2019', '\'');
if (buffer.IndexOf('\u201a') > -1) buffer = buffer.Replace('\u201a', ',');
if (buffer.IndexOf('\u201b') > -1) buffer = buffer.Replace('\u201b', '\'');
if (buffer.IndexOf('\u201c') > -1) buffer = buffer.Replace('\u201c', '\"');
if (buffer.IndexOf('\u201d') > -1) buffer = buffer.Replace('\u201d', '\"');
if (buffer.IndexOf('\u201e') > -1) buffer = buffer.Replace('\u201e', '\"');
if (buffer.IndexOf('\u2026') > -1) buffer = buffer.Replace("\u2026", "...");
if (buffer.IndexOf('\u2032') > -1) buffer = buffer.Replace('\u2032', '\'');
if (buffer.IndexOf('\u2033') > -1) buffer = buffer.Replace('\u2033', '\"');
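Since the question asks about Java, here is one possible port of the C# list above, offered as a sketch rather than a tested drop-in. The class name WordCharacters and the method name toAscii are made up for illustration. It walks the string once and uses a StringBuilder because the ellipsis expands to three characters.

```java
// Hypothetical helper class; a single-pass Java port of the C# replacements above.
final class WordCharacters {
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\u2013': // en dash
                case '\u2014': // em dash
                case '\u2015': // horizontal bar
                    out.append('-');
                    break;
                case '\u2017': // double low line
                    out.append('_');
                    break;
                case '\u2018': // left single quote
                case '\u2019': // right single quote / apostrophe
                case '\u201B': // single high-reversed-9 quote
                case '\u2032': // prime
                    out.append('\'');
                    break;
                case '\u201A': // single low-9 quote, mapped to a comma as in the C# code
                    out.append(',');
                    break;
                case '\u201C': // left double quote
                case '\u201D': // right double quote
                case '\u201E': // double low-9 quote
                case '\u2033': // double prime
                    out.append('"');
                    break;
                case '\u2026': // horizontal ellipsis
                    out.append("...");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, WordCharacters.toAscii("\u201Chi\u201D \u2013 it\u2019s\u2026") yields "hi" - it's... with straight quotes. All of the mapped characters are in the Basic Multilingual Plane, so iterating by char is enough here; anything else is copied through untouched.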
Source: https://stackoverflow.com/questions/2826191/converting-ms-word-curly-quotes-and-apostrophes