Java Regex : match whole word with word boundary

Submitted by ☆樱花仙子☆ on 2021-02-04 11:39:45

Question


I am trying to check whether a string contains a word as a whole, using Java. Below are some examples:

Text : "A quick brown fox"
Words:
"qui" - false
"quick" - true
"quick brown" - true
"ox" - false
"A" - true

Below is my code:

String pattern = "\\b(<word>)\\b";
String s = "ox";
String text = "A quick brown fox".toLowerCase();
System.out.println(Pattern.compile(pattern.replaceAll("<word>", s.toLowerCase())).matcher(text).find());

It works fine with strings like the ones in the example above. However, I get incorrect results if the input string contains characters such as % or (, e.g.:

Text : "c14, 50%; something (in) bracket"
Words:
"c14, 50%;" : false
"(in) bracket" : false

It has something to do with my regex pattern (or maybe my whole approach to the pattern matching is wrong). Could anyone suggest a better approach?


Answer 1:


It appears you only want to match "words" that are surrounded by whitespace (or that sit at the start/end of the string).

Use

String pattern = "(?<!\\S)" + Pattern.quote(word) + "(?!\\S)";

The (?<!\S) negative lookbehind fails any match that is immediately preceded by a char other than whitespace, and (?!\S) is a negative lookahead that fails any match that is immediately followed by a char other than whitespace. Both succeed at the start/end of the string, where there is no adjacent char at all. Pattern.quote() is necessary to escape special chars that need to be treated as literal chars in the regex pattern.
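To see the lookaround pattern on the inputs from the question, here is a minimal sketch. The class name WholeWordMatcher and the helper containsWholeWord are made up for illustration, and using Pattern.CASE_INSENSITIVE in place of the question's toLowerCase() calls is an assumption on my part, not something the answer prescribes.

```java
import java.util.regex.Pattern;

public class WholeWordMatcher {

    // Hypothetical helper: true if `word` occurs in `text` delimited by
    // whitespace or by the start/end of the string.
    public static boolean containsWholeWord(String text, String word) {
        // Pattern.quote() makes chars like %, ( and ) match literally.
        String regex = "(?<!\\S)" + Pattern.quote(word) + "(?!\\S)";
        // Assumption: case-insensitive matching replaces the toLowerCase() calls.
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "c14, 50%; something (in) bracket";
        System.out.println(containsWholeWord(text, "c14, 50%;"));     // true
        System.out.println(containsWholeWord(text, "(in) bracket"));  // true
        System.out.println(containsWholeWord("A quick brown fox", "ox"));          // false
        System.out.println(containsWholeWord("A quick brown fox", "quick brown")); // true
    }
}
```

Note that this treats punctuation as part of a word: searching for "c14" in the text above returns false, because it is followed by a comma rather than whitespace.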




Answer 2:


Try escaping the special characters with a backslash. They can have other meanings in a pattern.

Small correction: you will probably need two backslashes, since the backslash itself is an escape character in a Java string literal.
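As a rough illustration of the escaping point, the sketch below compares an unescaped pattern, a manually escaped one, and Pattern.quote(). The class name EscapeDemo and the helper found are hypothetical. Keep in mind that escaping alone does not fix the \b problem from the question: \b only matches between a word char and a non-word char, so it cannot match between ";" and a space.

```java
import java.util.regex.Pattern;

public class EscapeDemo {

    // Hypothetical helper: true if `regex` matches anywhere in `text`.
    public static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "something in bracket"; // note: no literal parentheses here

        // Unescaped: the parentheses form a capturing group, so this matches plain "in".
        System.out.println(found("(in)", text));                // true

        // Escaped: "\\(" in Java source is the regex \( and matches a literal "(".
        System.out.println(found("\\(in\\)", text));            // false

        // Pattern.quote() performs the same literal matching without manual escaping.
        System.out.println(found(Pattern.quote("(in)"), text)); // false
    }
}
```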



Source: https://stackoverflow.com/questions/42904361/java-regex-match-whole-word-with-word-boundary
