How to match repeated patterns?

我在风中等你 2020-12-08 14:59

I would like to match:

some.name.separated.by.dots

But I don't have any idea how.

I can match a single part like this

         


        
4 Answers
  • 2020-12-08 15:17

    You can use ? to match zero or one occurrence of the preceding element, * to match zero or more, and + to match one or more.

    So (\w\.)? will match w. and the empty string, (\w\.)* will match r.2.5.3.1.s.r.g.s. and the empty string, and (\w\.)+ will match either of those strings but not the empty string.

    If you want to match something like your example, you'll need (\w+\.)+, which means 'match one or more word characters followed by a period, and repeat that group at least once'. Note that every part must then end in a period, so on some.name.separated.by.dots this matches only up to by. and leaves the final dots unmatched.
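
As a rough illustration of the three quantifiers, here is a minimal Java sketch. The class name QuantifierDemo and the helper fullMatch are made up for this example; Pattern.matches is the standard library call, and it only succeeds when the whole input matches.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class QuantifierDemo {
    // True only if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("(\\w\\.)?", ""));                    // true: zero occurrences allowed
        System.out.println(fullMatch("(\\w\\.)?", "w."));                  // true: one occurrence
        System.out.println(fullMatch("(\\w\\.)*", "r.2.5.3.1.s.r.g.s."));  // true: many occurrences
        System.out.println(fullMatch("(\\w\\.)+", ""));                    // false: needs at least one
        System.out.println(fullMatch("(\\w+\\.)+", "some.name."));         // true: each part ends in a dot
        System.out.println(fullMatch("(\\w+\\.)+", "some.name"));          // false: last part has no dot
    }
}
```

The last two lines show why (\w+\.)+ alone does not cover a dotted name that lacks a trailing period.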

  • 2020-12-08 15:33

    Try the following:

    \w+(\.\w+)+
    

    The + after ( ... ) tells it to match what is inside the parentheses one or more times.

    Note that in Java \w matches only ASCII word characters ([a-zA-Z0-9_]) by default, so a word like café wouldn't be matched in full by \w+, let alone text in non-Latin scripts.
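
If non-ASCII words matter, one possible approach in Java is the Pattern.UNICODE_CHARACTER_CLASS flag (available since Java 7), which makes \w follow the Unicode definition of a word character. The sketch below is only an illustration; the class and method names are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class UnicodeWordDemo {
    // Default behaviour: \w is ASCII-only.
    static boolean asciiWord(String s) {
        return Pattern.compile("\\w+").matcher(s).matches();
    }

    // With UNICODE_CHARACTER_CLASS, \w also accepts non-ASCII letters.
    static boolean unicodeWord(String s) {
        return Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS)
                      .matcher(s).matches();
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        System.out.println(asciiWord(word));   // false
        System.out.println(unicodeWord(word)); // true
    }
}
```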

    EDIT

    The difference between [...] and (...) is that [...] always matches a single character. It is called a "character set" or "character class". So, [abc] does not match the string "abc", but matches one of the characters a, b or c.

    The reason \w+[\.\w+]* also matches your string is that [\.\w+] matches a single ., a character from \w, or a literal + (inside a character class, + is not a quantifier), and the * after it repeats that class zero or more times. But \w+[\.\w+]* will therefore also match strings like aaaaa or aaa............

    The (...) is, as I already mentioned, simply used to group characters (and possibly repeat those groups).
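
To make the difference concrete, here is a small Java sketch that runs both patterns against a few inputs. The class name, constants, and helper methods are hypothetical; the input a+b is an extra case added here to show the literal + inside the character class.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class ClassVersusGroupDemo {
    static final Pattern CHAR_CLASS = Pattern.compile("\\w+[\\.\\w+]*");
    static final Pattern GROUP = Pattern.compile("\\w+(\\.\\w+)+");

    // Whole-string match against the character-class version.
    static boolean classMatches(String s) {
        return CHAR_CLASS.matcher(s).matches();
    }

    // Whole-string match against the grouped version.
    static boolean groupMatches(String s) {
        return GROUP.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = { "some.name.separated.by.dots", "aaaaa", "aaa............", "a+b" };
        for (String s : inputs) {
            // Only the first input is accepted by both patterns;
            // the character-class version accepts all four.
            System.out.println(s + " class=" + classMatches(s) + " group=" + groupMatches(s));
        }
    }
}
```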

    More info on character sets: http://www.regular-expressions.info/charclass.html

    More info on groups: http://www.regular-expressions.info/brackets.html

    EDIT II

    Here's an example in Java (seeing you post mostly Java answers):

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class Main {
        public static void main(String[] args) {
            String text = "some.text.here only but not Some other " + 
                    "there some.name.separated.by.dots and.we are done!";
            Pattern p = Pattern.compile("\\w+(\\.\\w+)+");
            Matcher m = p.matcher(text);
            while(m.find()) {
                System.out.println(m.group());
            }
        }
    }
    

    which will produce:

    some.text.here
    some.name.separated.by.dots
    and.we
    

    Note that m.group(0) and m.group() are equivalent: both return the entire match.
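
A related detail that may be worth knowing with repeated groups: in Java, a capturing group that is repeated keeps only the text from its last iteration. The following sketch (class and method names are invented for illustration) shows group(), group(0) and group(1) side by side.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class GroupDemo {
    // Returns { group(), group(0), group(1) } for the first match, or null if nothing matches.
    static String[] groups(String text) {
        Matcher m = Pattern.compile("\\w+(\\.\\w+)+").matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("some.name.separated.by.dots");
        System.out.println(g[0]); // some.name.separated.by.dots
        System.out.println(g[1]); // some.name.separated.by.dots
        System.out.println(g[2]); // .dots (last repetition of the group only)
    }
}
```

If every part is needed separately, splitting the matched text on the dot is usually simpler than trying to capture each repetition.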

  • 2020-12-08 15:35
    (\w+\.)+
    

    Apparently, the body has to be at least 30 characters. I hope this is enough.

  • 2020-12-08 15:39

    This will also work:

    (\w+(\.|$))+
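
For comparison, here is a hedged Java sketch that returns the first substring each of the patterns in this thread finds. The class name and the firstMatch helper are hypothetical; the second input is an extra case added to show how the (\.|$) alternative behaves when the name is followed by more text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class AnswerComparison {
    // Returns the first substring the regex finds in the text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String name = "some.name.separated.by.dots";
        System.out.println(firstMatch("(\\w+\\.)+", name));       // some.name.separated.by.
        System.out.println(firstMatch("\\w+(\\.\\w+)+", name));   // some.name.separated.by.dots
        System.out.println(firstMatch("(\\w+(\\.|$))+", name));   // some.name.separated.by.dots

        // When the name is not at the end of the input, $ cannot help
        // and the match keeps a trailing dot.
        System.out.println(firstMatch("(\\w+(\\.|$))+", "and.we are done")); // and.
    }
}
```

So (\w+(\.|$))+ behaves well when the dotted name is the whole input, while \w+(\.\w+)+ is the safer choice for finding names inside longer text.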
    