Question
I am cleaning up email addresses that I got from Tesseract OCR.
Here is my code:
if (email != null) {
email = email.replaceAll(" ", "");
email = email.replaceAll("caneer", "career");
email = email.replaceAll("canaer", "career");
email = email.replaceAll("canear", "career");
email = email.replaceAll("caraer", "career");
email = email.replaceAll("carear", "career");
email = email.replace("|", "l");
email = email.replaceAll("}", "j");
email = email.replaceAll("j3b", "job");
email = email.replaceAll("gmaii.com", "gmail.com");
email = email.replaceAll("hotmaii.com", "hotmail.com");
email = email.replaceAll(".c0m", ".com");
email = email.replaceAll(".coin", ".com");
email = email.replaceAll("consuit", "consult");
}
return email;
But the output is not correct.
Input :
amrut=ac.hrworks@g mai|.com
Output :
lalcl.lhlrlwlolrlklsl@lglmlalil|l.lclolml
But when I assign the result to a new String after every replacement, it works fine. Why does repeatedly assigning to the same String not work?
Answer 1:
You'll note in the Javadoc for String.replaceAll() that the first argument is a regular expression.
A period (.) has a special meaning there, as do a pipe (|) and a curly brace (}). You need to escape them all, such as:
email = email.replaceAll("gmaii\\.com", "gmail.com");
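To make the effect concrete, here is a minimal sketch of what the unescaped metacharacters do. The class and method names (RegexEscapeDemo, unescapedPipe and so on) are made up for this illustration and are not from the question. An unescaped | is an alternation between two empty patterns, so it matches the empty string at every position, which would explain the l inserted between every character in the reported output.

```java
// Hypothetical demo class; names are illustrative only.
class RegexEscapeDemo {

    // "|" as a regex: empty alternation, matches the empty string everywhere.
    static String unescapedPipe(String s) {
        return s.replaceAll("|", "l");
    }

    // "\\|" as a regex: matches only a literal pipe character.
    static String escapedPipe(String s) {
        return s.replaceAll("\\|", "l");
    }

    // ".c0m" as a regex: any single character followed by "c0m".
    static String unescapedDot(String s) {
        return s.replaceAll(".c0m", ".com");
    }

    // "\\.c0m" as a regex: a literal dot followed by "c0m".
    static String escapedDot(String s) {
        return s.replaceAll("\\.c0m", ".com");
    }

    public static void main(String[] args) {
        System.out.println(unescapedPipe("a|b"));     // lal|lbl
        System.out.println(escapedPipe("gmai|.com")); // gmail.com
        System.out.println(unescapedDot("abc0m"));    // a.com
        System.out.println(escapedDot("abc0m"));      // abc0m
    }
}
```

Note that in the unescaped dot case the character before c0m is consumed as well, so a valid letter can silently disappear from the address.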
Answer 2:
(Is this Java?)
Note that in Java, replaceAll accepts a regular expression and the dot matches any character. You need to escape the dot or use
somestring.replaceAll(Pattern.quote("gmail.com"), "replacement");
Also note the typo here:
email = emai.replaceAll("canear", "career");
should be
email = email.replaceAll("canear", "career");
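As a rough sketch of the two literal-matching options: Pattern.quote wraps the search text so that replaceAll treats it literally, and String.replace(CharSequence, CharSequence) replaces every occurrence without using regex syntax at all. The class and method names below are hypothetical and exist only for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class LiteralReplaceDemo {

    // replaceAll with a quoted pattern: the dot is matched literally.
    static String viaQuote(String s) {
        return s.replaceAll(Pattern.quote("gmaii.com"), "gmail.com");
    }

    // String.replace takes plain text, not a regex, and still replaces
    // every occurrence.
    static String viaReplace(String s) {
        return s.replace("gmaii.com", "gmail.com");
    }

    public static void main(String[] args) {
        System.out.println(viaQuote("user@gmaii.com"));   // user@gmail.com
        System.out.println(viaReplace("user@gmaii.com")); // user@gmail.com
        System.out.println(viaQuote("user@gmaiixcom"));   // unchanged: the dot must be a real dot
    }
}
```

When none of the search strings need regex features, plain replace is probably the simpler choice, since there is nothing to escape.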
Answer 3:
Once you realize that the first argument of replaceAll() is a regex, you can get by with far fewer replacements.
For example, you can catch the possible misspellings of the word career with the following regex (inside a character class, | is a literal character, so it is left out):
email = email.replaceAll("ca[nr][ea][ea]r", "career");
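A small sketch of the character-class approach, assuming the intended classes are [nr] and [ea]. Inside square brackets a | is a literal character rather than an alternation, so a class written as [n|r] would also accept a pipe. The class and method names are invented for this example. Be aware that the pattern is broader than the five misspellings listed in the question: it also matches strings such as canaar.

```java
// Hypothetical demo class; names are illustrative only.
class CareerFixDemo {

    // One pattern covering caneer, canaer, canear, caraer, carear
    // (and "career" itself, which is replaced by itself).
    static String fixCareer(String s) {
        return s.replaceAll("ca[nr][ea][ea]r", "career");
    }

    // Same idea with "|" left inside the classes, to show that it is
    // matched as a literal pipe there.
    static String fixCareerWithPipes(String s) {
        return s.replaceAll("ca[n|r][e|a][e|a]r", "career");
    }

    public static void main(String[] args) {
        System.out.println(fixCareer("canear@example.com"));          // career@example.com
        System.out.println(fixCareer("ca|eer@example.com"));          // unchanged
        System.out.println(fixCareerWithPipes("ca|eer@example.com")); // career@example.com
    }
}
```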
Answer 4:
You have to escape . by writing \\. like the following:
if (email != null) {
email = email.replaceAll(" ", "");
email = email.replaceAll("caneer", "career");
email = email.replaceAll("canaer", "career");
email = email.replaceAll("canear", "career");
email = email.replaceAll("caraer", "career");
email = email.replaceAll("carear", "career");
email = email.replace("|", "l");
email = email.replaceAll("}", "j");
email = email.replaceAll("j3b", "job");
email = email.replaceAll("gmaii\\.com", "gmail.com");
email = email.replaceAll("hotmaii\\.com", "hotmail.com");
email = email.replaceAll("\\.c0m", ".com");
email = email.replaceAll("\\.coin", ".com");
email = email.replaceAll("consuit", "consult");
}
return email;
Answer 5:
You are using some regex metacharacters.
Please escape them with a backslash (written \\ in a Java string literal) or use the Pattern.quote method.
Answer 6:
I think you are not aware that the first parameter of replaceAll is a regex.
The characters ., | and } might be interpreted differently from what you expect. From the Pattern documentation:
. Any character (may or may not match line terminators)
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
For spaces, you had better use
\s A whitespace character: [ \t\n\x0B\f\r]
and escape the other special characters with a leading \\
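For the whitespace part, a minimal sketch might look like this. The class and method names are hypothetical. The regex \s has to be written as "\\s" in a Java string literal, and the + is only there so that a run of whitespace is removed in one match.

```java
// Hypothetical demo class; names are illustrative only.
class WhitespaceDemo {

    // Removes spaces, tabs, newlines and other whitespace matched by \s.
    static String stripWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWhitespace("user@g mai\tl.com\n")); // user@gmail.com
    }
}
```

Unlike replaceAll(" ", ""), this also catches tabs and line breaks that OCR output may contain.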
Source: https://stackoverflow.com/questions/14826143/string-replaceall-is-not-working