Java regex throws java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence for the letter g

徘徊边缘 submitted on 2021-02-17 05:01:11

Question


I need to see if a whole word exists in a string. This is how I try to do it:

if(text.matches(".*\\" + word + "\\b.*"))
    // do something

It works for most words, but words that start with a g cause an error:

Exception in thread "main" java.util.regex.PatternSyntaxException:
Illegal/unsupported escape sequence near index 3 
.*\great life\b.*
   ^

How can I fix this?


Answer 1:


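In a Java string literal, \\ is a single backslash, so the regex engine interprets that backslash together with the character that follows it as an escape sequence. E.g. ".*\\geza\\b.*" makes the engine look for a \g escape sequence, ".*\\jani\\b.*" makes it look for \j, etc.

Some of these sequences exist and others don't; you can check the Pattern docs for details. The bigger problem is that this is probably not what you want in the first place: when the sequence does exist, the pattern compiles but means something else.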
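
To illustrate the "exists but is not what you want" case, here is a minimal sketch. The class name, method name and sample strings are made up for this example; it only assumes that \d means "any digit", as listed in the Pattern docs.

```java
// Hypothetical demo class; names and sample strings are illustrative only.
class ValidEscapeDemo {
    // Builds the pattern the same way as in the question:
    // a backslash glued directly onto the first letter of the word.
    static boolean questionStyleMatch(String text, String word) {
        return text.matches(".*\\" + word + "\\b.*");
    }

    public static void main(String[] args) {
        // With word = "dog" the regex becomes .*\dog\b.*
        // "\d" is a valid escape (a digit), so nothing is thrown,
        // but the pattern now means "a digit followed by og".
        System.out.println(questionStyleMatch("a dog", "dog")); // false
        System.out.println(questionStyleMatch("7og", "dog"));   // true
    }
}
```

So a word starting with d does not crash, it just quietly matches the wrong thing.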

I agree with Thomas Ayoub that you probably need \\b...\\b to match a whole word. I would go one step further and use Pattern.quote, so that any regex metacharacters inside word are treated literally:

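// requires: import java.util.regex.Pattern;
String text = "Lorem Ipsum a[asd]a sad";
String word = "a[asd]a";
// Pattern.quote makes the brackets in word match literally
if (text.matches(".*\\b" + Pattern.quote(word) + "\\b.*")) {
    // do something
}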
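
A small sketch of what Pattern.quote changes for this particular word. The class and method names are hypothetical; only Pattern.quote and String.matches come from the standard library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the quoted and unquoted pattern.
class QuoteDemo {
    // word is inserted as-is, so "[asd]" is read as a character class.
    static boolean unquoted(String text, String word) {
        return text.matches(".*\\b" + word + "\\b.*");
    }

    // word is wrapped with Pattern.quote, so every character is literal.
    static boolean quoted(String text, String word) {
        return text.matches(".*\\b" + Pattern.quote(word) + "\\b.*");
    }

    public static void main(String[] args) {
        String word = "a[asd]a";
        // Unquoted: the regex looks for "aaa", "asa" or "ada".
        System.out.println(unquoted("Lorem Ipsum a[asd]a sad", word)); // false
        System.out.println(unquoted("Lorem asa Ipsum", word));         // true
        // Quoted: the regex looks for the literal text a[asd]a.
        System.out.println(quoted("Lorem Ipsum a[asd]a sad", word));   // true
        System.out.println(quoted("Lorem asa Ipsum", word));           // false
    }
}
```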



Answer 2:


The actual reason for the error is that, in a Java regex pattern, a backslash before an alphabetic character is only allowed when the two together form a valid escape construct.

See the Java regex documentation:

It is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language. A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct.
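
The rule quoted above can be checked with a short sketch. The class and helper names are hypothetical; the helper simply reports whether Pattern.compile accepts a given regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class for the backslash rule.
class EscapeRuleDemo {
    // Returns true if the regex compiles, false if it is rejected.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \g is alphabetic and not an escaped construct: rejected.
        System.out.println(compiles("\\great"));  // false
        // \b is a valid construct (word boundary): accepted.
        System.out.println(compiles("\\bgreat")); // true
        // A backslash before a non-alphabetic character is always allowed.
        System.out.println(compiles("\\-great")); // true
    }
}
```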

I'd use

Matcher m = Pattern.compile("\\b" + word + "\\b").matcher(text);
if (m.find()) {
    // A match is found
}

If the word may contain, start with, or end with special characters, I'd use

Matcher m = Pattern.compile("(?<!\\w)" + Pattern.quote(word) + "(?!\\w)").matcher(text);
if (m.find()) {
    // A match is found
}
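
As a rough sketch of why the lookaround version matters, the two approaches can be wrapped in helper methods and compared on a word that ends with non-word characters. The class name, method names and sample strings are assumptions made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of any library.
class WholeWord {
    // Lookaround version: the word must not be directly preceded or
    // followed by a word character (letter, digit or underscore).
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("(?<!\\w)" + Pattern.quote(word) + "(?!\\w)");
        return p.matcher(text).find();
    }

    // \b version, shown for comparison only.
    static boolean containsWordWithBoundary(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("I like C++ a lot", "C++"));             // true
        // There is no word boundary between '+' and ' ', so \b fails here.
        System.out.println(containsWordWithBoundary("I like C++ a lot", "C++")); // false
        System.out.println(containsWord("the greatest", "great"));               // false
        System.out.println(containsWord("a great life ahead", "great life"));    // true
    }
}
```

Because find() searches anywhere in the text, no leading or trailing .* is needed, unlike with String.matches.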



Answer 3:


Using ".*\\" + word + "\\b.*" with word = great life builds the string literal ".*\\great life\\b.*", whose actual value is .*\great life\b.*. The issue is that \g is not a valid escape sequence for the regex engine; the escapes allowed in Java string literals are a separate list (see What are all the escape characters in Java?).

You can use

if(text.matches(".*\\b" + word + "\\b.*"))
                     ^


Source: https://stackoverflow.com/questions/43616202/java-regex-throws-java-util-regex-patternsyntaxexception-illegal-unsupported-es
