Regexp grouping and replaceAll with .* in Java duplicates the replacement

故里飘歌 2021-01-18 17:28

I have a problem using regexp in Java. The example code prints ABC_012_suffix_suffix, but I expected it to print ABC_012_suffix.



        
3 answers
  •  自闭症患者
    2021-01-18 17:36

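    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    Pattern regexp  = Pattern.compile(".*");
    Matcher matcher = regexp.matcher("ABC_012");
    matcher.matches();                                    // match the whole input so group(0) is available
    System.out.println(matcher.group(0));                 // group 0 is the entire match
    System.out.println(matcher.replaceAll("$0_suffix"));  // resets the matcher, then replaces every match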
    

    The same happens here. The output is:

    ABC_012
    ABC_012_suffix_suffix
    

    The reason is hidden in the replaceAll method: it tries to find all subsequences that match the pattern:

    matcher.reset();  // replaceAll above leaves the matcher exhausted, so rewind first
    while (matcher.find()) {
      System.out.printf("Start: %s, End: %s%n", matcher.start(), matcher.end());
    }
    

    This will result in:

    Start: 0, End: 7
    Start: 7, End: 7
    

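    So, surprisingly, the matcher finds two subsequences: "ABC_012" (indices 0 to 7) and then the empty string at the end of the input (index 7). Because .* also matches zero characters, a second, empty match succeeds there once the first match has consumed everything. replaceAll applies the replacement to both matches, so "_suffix" is appended twice: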

    "ABC_012" + "_suffix" + "" + "_suffix"
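
    The double match can be reproduced by hand, and avoided. The sketch below is only an illustration: the class name SuffixDemo and the helper appendToEveryMatch are made up for this example, and the helper merely mimics the find/append loop described above using the standard Matcher.appendReplacement and Matcher.appendTail calls. Two likely fixes, assuming the goal is a single suffix, are a pattern that cannot match the empty string (.+) or replaceFirst.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SuffixDemo {
    // Hypothetical helper: applies the replacement to every match found,
    // one find() at a time, so each match (including an empty one) is visible.
    static String appendToEveryMatch(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the expanded replacement.
            m.appendReplacement(out, replacement);
        }
        m.appendTail(out); // copies whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "ABC_012";

        // ".*" matches "ABC_012" and then the empty string at index 7.
        System.out.println(appendToEveryMatch(".*", input, "$0_suffix")); // ABC_012_suffix_suffix

        // ".+" needs at least one character, so the empty match at the end is gone.
        System.out.println(input.replaceAll(".+", "$0_suffix"));          // ABC_012_suffix

        // replaceFirst stops after the first match.
        System.out.println(input.replaceFirst(".*", "$0_suffix"));        // ABC_012_suffix
    }
}
```

    The two fixes differ on empty input: .+ finds no match and leaves "" unchanged, while replaceFirst with .* matches the empty string once and produces "_suffix".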
    
