Question
I want to change this sentence:
Et ça sera sa moitié.
To:
Et ca sera sa moitie.
Is there an easy way to do this in Java, like I would do in Objective-C?
NSString *str = @"Et ça sera sa moitié.";
NSData *data = [str dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
NSString *newStr = [[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
Answer 1:
I finally solved it by using the java.text.Normalizer class.
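import java.text.Normalizer;
public static String stripAccents(String s)
{
    // Decompose each accented character into a base letter plus combining marks.
    s = Normalizer.normalize(s, Normalizer.Form.NFD);
    // Remove the combining marks so that only the base letters remain.
    s = s.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
    return s;
}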
Answer 2:
Maybe the easiest and safest way is to use StringUtils from Apache Commons Lang:
StringUtils.stripAccents(String input)
Removes diacritics (~= accents) from a string. The case will not be altered. For instance, 'à' will be replaced by 'a'. Note that ligatures will be left as is.
Answer 3:
I guess the only difference is that I use a + quantifier rather than a [] character class, compared to the solution above. I think both work, but it's better to have this variant here as well.
String normalized = Normalizer.normalize(input, Normalizer.Form.NFD);
String accentRemoved = normalized.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
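To check that claim, here is a minimal sketch that runs both variants on the sentence from the question. The class and method names (AccentRegexComparison, stripWithCharClass, stripWithPlus) are made up for illustration, and the accented letters are written as Unicode escapes only so the result does not depend on the source file encoding.

```java
import java.text.Normalizer;

class AccentRegexComparison {
    // Character-class variant: each combining mark is matched and removed on its own.
    static String stripWithCharClass(String input) {
        String normalized = Normalizer.normalize(input, Normalizer.Form.NFD);
        return normalized.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
    }

    // Quantifier variant: a whole run of consecutive combining marks is removed per match.
    static String stripWithPlus(String input) {
        String normalized = Normalizer.normalize(input, Normalizer.Form.NFD);
        return normalized.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // "Et ça sera sa moitié." with U+00E7 and U+00E9 written as escapes.
        String input = "Et \u00e7a sera sa moiti\u00e9.";
        System.out.println(stripWithCharClass(input)); // Et ca sera sa moitie.
        System.out.println(stripWithPlus(input));      // Et ca sera sa moitie.
    }
}
```

The output is the same either way. The + version simply needs fewer replacements when a letter carries several stacked marks.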
Answer 4:
Assuming you are using Java 6 or newer, you might want to take a look at Normalizer, which can decompose accents, then use a regex to strip the combining accents.
Otherwise, you should be able to achieve the same result using ICU4J.
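To make the "decompose" step more concrete, here is a small sketch that prints the code points before and after NFD normalization. The class and helper names (DecompositionDemo, decompose, codePoints) are hypothetical. It also shows, as far as I can tell, why some letters survive the regex: characters such as ø (U+00F8) have no canonical decomposition, so there is no combining mark to strip.

```java
import java.text.Normalizer;

class DecompositionDemo {
    // Thin wrapper around the NFD normalization step.
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Formats the code points of a string, for example "U+0065 U+0301".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";  // e with acute accent, precomposed
        String oSlash = "\u00f8";  // o with stroke, no canonical decomposition

        System.out.println(codePoints(eAcute));            // U+00E9
        System.out.println(codePoints(decompose(eAcute))); // U+0065 U+0301
        System.out.println(codePoints(decompose(oSlash))); // U+00F8
    }
}
```

After decomposition, é is a plain e followed by U+0301, which lies in the Combining Diacritical Marks block and is what the regex removes.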
Answer 5:
For Kotlin:
import java.text.Normalizer
fun stripAccents(s: String): String
{
    // Decompose accented characters, then drop the combining marks.
    var string = Normalizer.normalize(s, Normalizer.Form.NFD)
    string = Regex("\\p{InCombiningDiacriticalMarks}+").replace(string, "")
    return string
}
Answer 6:
Thank you. Here is a variant that also strips modifier letters (Lm) and modifier symbols (Sk):
import java.text.Normalizer;
import java.util.regex.Pattern;
public static final Pattern DIACRITICS_AND_FRIENDS = Pattern.compile(
    "[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");
private static String stripDiacritics(String str) {
    str = Normalizer.normalize(str, Normalizer.Form.NFD);
    str = DIACRITICS_AND_FRIENDS.matcher(str).replaceAll("");
    return str;
}
// Usage:
stripDiacritics("Et Ça sera sa moitié.");
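One thing to be aware of with this wider pattern: the Sk (modifier symbol) category also contains the ASCII characters ^ and `, so they get removed along with the accents. The sketch below wraps the same pattern in a hypothetical DiacriticsAndFriendsDemo class to show the effect; whether that is acceptable depends on your input.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

class DiacriticsAndFriendsDemo {
    // Same pattern as above: combining marks, modifier letters (Lm), modifier symbols (Sk).
    static final Pattern DIACRITICS_AND_FRIENDS = Pattern.compile(
            "[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");

    static String stripDiacritics(String str) {
        str = Normalizer.normalize(str, Normalizer.Form.NFD);
        return DIACRITICS_AND_FRIENDS.matcher(str).replaceAll("");
    }

    public static void main(String[] args) {
        // "Et Ça sera sa moitié." with U+00C7 and U+00E9 written as escapes.
        System.out.println(stripDiacritics("Et \u00c7a sera sa moiti\u00e9.")); // Et Ca sera sa moitie.
        // ASCII circumflex and backtick are in category Sk, so they are stripped too.
        System.out.println(stripDiacritics("a^b `c`")); // ab c
    }
}
```

If the text may contain code snippets or math notation, the narrower pattern with only InCombiningDiacriticalMarks is probably the safer choice.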
Source: https://stackoverflow.com/questions/15190656/easy-way-to-remove-accents-from-a-unicode-string