Question
I need to see if a whole word exists in a string. This is how I try to do it:
if(text.matches(".*\\" + word + "\\b.*"))
// do something
It works for most words, but words that start with g cause an error:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Illegal/unsupported escape sequence near index 3
.*\great life\b.*
   ^
How can I fix this?
Answer 1:
A \\ in the string literal followed by any character is interpreted by the regex engine as an escape sequence. E.g. ".*\\geza\\b.*" will try to find the \g escape sequence, ".*\\jani\\b.*" will try to find \j, etc.
Some of these sequences exist and others don't; you can check the Pattern docs for details. What's really troubling is that this probably isn't what you want.
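To make this concrete, here is a minimal sketch that checks which of these patterns the regex engine accepts. The class name EscapeCheck and the helper compiles are made up for illustration. It also hints at the silent failure mode: a word starting with s compiles, but \s then means "whitespace character", so the pattern no longer looks for the word itself.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class EscapeCheck {
    /** Returns true if the given regex source compiles, false on a syntax error. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \g and \j are not defined escapes, so compilation fails.
        System.out.println(compiles(".*\\geza\\b.*"));
        System.out.println(compiles(".*\\jani\\b.*"));
        // \s is a defined escape (whitespace), so this compiles...
        System.out.println(compiles(".*\\sam\\b.*"));
        // ...but it now requires a whitespace character followed by "am".
        System.out.println("sam".matches(".*\\sam\\b.*"));
        System.out.println("I am".matches(".*\\sam\\b.*"));
    }
}
```

So an exception is arguably the better outcome: the patterns that do compile can quietly match something other than the intended word.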
I agree with Thomas Ayoub that you probably need to match \\b...\\b to find a word. I would go one step further and use Pattern.quote to avoid unintended regex features that might come from word:
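import java.util.regex.Pattern;

String text = "Lorem Ipsum a[asd]a sad";
String word = "a[asd]a";
// Pattern.quote makes the regex engine treat word as a literal string
if (text.matches(".*\\b" + Pattern.quote(word) + "\\b.*")) {
    // do something
}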
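As a rough illustration of why the quoting matters, the sketch below compares the same whole-word test with and without Pattern.quote. The class QuoteDemo and its two methods are hypothetical names, not part of any library. Without quoting, [asd] is read as a character class, so the pattern looks for words such as "asa" instead of the literal text.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    /** Whole-word test where word is spliced into the regex as-is. */
    static boolean unquoted(String text, String word) {
        return text.matches(".*\\b" + word + "\\b.*");
    }

    /** Whole-word test where word is treated as a literal via Pattern.quote. */
    static boolean quoted(String text, String word) {
        return text.matches(".*\\b" + Pattern.quote(word) + "\\b.*");
    }

    public static void main(String[] args) {
        String word = "a[asd]a";
        // The literal word is present in this text.
        System.out.println(quoted("Lorem Ipsum a[asd]a sad", word));
        System.out.println(unquoted("Lorem Ipsum a[asd]a sad", word));
        // The literal word is absent here, but "asa" fits the character class.
        System.out.println(quoted("an asa word", word));
        System.out.println(unquoted("an asa word", word));
    }
}
```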
Answer 2:
The actual reason for the error is that, in a Java regex pattern, you cannot put a backslash before an alphabetic character unless the two together form a valid escape construct.
See Java regex documentation:
It is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language. A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct.
I'd use
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Matcher m = Pattern.compile("\\b" + word + "\\b").matcher(text);
if (m.find()) {
    // A match is found
}
If a word may contain/start/end with special chars, I'd use
Matcher m = Pattern.compile("(?<!\\w)" + Pattern.quote(word) + "(?!\\w)").matcher(text);
if (m.find()) {
    // A match is found
}
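A small sketch of the difference between the two variants, under the assumption that the word is quoted in both so that only the boundary handling differs. The class and method names are illustrative. \b needs a word character on one side, so it cannot match after a word that ends in a symbol such as "c++", whereas the lookarounds only require that no word character is adjacent.

```java
import java.util.regex.Pattern;

final class WholeWordSearch {
    /** Whole-word search delimited by \b on both sides. */
    static boolean containsWithBoundary(String text, String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b")
                .matcher(text).find();
    }

    /** Whole-word search delimited by "no word character before/after". */
    static boolean containsWithLookaround(String text, String word) {
        return Pattern.compile("(?<!\\w)" + Pattern.quote(word) + "(?!\\w)")
                .matcher(text).find();
    }

    public static void main(String[] args) {
        // The word ends in '+', so there is no \b between "c++" and the space.
        System.out.println(containsWithBoundary("I like c++ a lot", "c++"));
        System.out.println(containsWithLookaround("I like c++ a lot", "c++"));
        // Both variants still reject a word embedded in a longer word.
        System.out.println(containsWithBoundary("the greatest life", "great"));
        System.out.println(containsWithLookaround("the greatest life", "great"));
    }
}
```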
Answer 3:
Using ".*\\" + word + "\\b.*" with word = great life will generate the string literal ".*\\great life\\b.*", whose value is .*\great life\b.*. The issue is that \g does not belong to the list of escape sequences that Java's regex engine accepts (see What are all the escape characters in Java?).
You can use
if(text.matches(".*\\b" + word + "\\b.*"))
                     ^
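The two layers of escaping can be seen by printing the assembled strings. This is only a sketch with made-up names (PatternLayers, broken, corrected), and it splices word in unquoted, so it assumes the word contains no regex metacharacters.

```java
final class PatternLayers {
    /** Builds the pattern from the question: a backslash directly before the word. */
    static String broken(String word) {
        return ".*\\" + word + "\\b.*";
    }

    /** Builds the corrected pattern: \b on both sides of the word. */
    static String corrected(String word) {
        return ".*\\b" + word + "\\b.*";
    }

    public static void main(String[] args) {
        String word = "great life";
        // Each "\\" in the source is a single backslash in the string value.
        System.out.println(broken(word));    // the regex engine sees \g here
        System.out.println(corrected(word)); // the regex engine sees \b, then g
        System.out.println("It is a great life indeed".matches(corrected(word)));
    }
}
```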
Source: https://stackoverflow.com/questions/43616202/java-regex-throws-java-util-regex-patternsyntaxexception-illegal-unsupported-es