Non-greedy Regular Expression in Java

十年热恋 提交于 2019-12-09 15:17:47

问题


I have next code:

public static void createTokens(){
    String test = "test is a word word word word big small";
    Matcher mtch = Pattern.compile("test is a (\\s*.+?\\s*) word (\\s*.+?\\s*)").matcher(test);
    while (mtch.find()){
        for (int i = 1; i <= mtch.groupCount(); i++){
            System.out.println(mtch.group(i));
        }
    }
}

And have next output:

word
w

But in my opinion it must be:

word
word

Somebody please explain me why so?


回答1:


Because your patterns are non-greedy, so they matched as little text as possible while still consisting of a match.

Remove the ? in the second group, and you'll get
word
word word big small

Matcher mtch = Pattern.compile("test is a (\\s*.+?\\s*) word (\\s*.+\\s*)").matcher(test);



回答2:


By using \\s* it will match any number of spaces including 0 spaces. w matches (\\s*.+?\\s*). To make sure it matches a word separated by spaces try (\\s+.+?\\s+)



来源:https://stackoverflow.com/questions/8931183/non-greedy-regular-expression-in-java

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!