Collect HashSet / Java 8 / Regex Pattern / Stream API

日久生厌 2020-12-29 09:42

I recently switched my project from JDK 7 to JDK 8, and I am now rewriting some code snippets to use the new features that came with Java 8.

final Matcher         


        
6 answers
  •  南笙 (original poster)
    2020-12-29 10:22

    A Matcher-based spliterator implementation can be quite simple if you reuse the JDK-provided Spliterators.AbstractSpliterator:

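    import java.util.Spliterators.AbstractSpliterator;
    import java.util.function.Consumer;
    import java.util.regex.Matcher;

    public class MatcherSpliterator extends AbstractSpliterator<String[]>
    {
      private final Matcher m;
    
      public MatcherSpliterator(Matcher m) {
        // The number of matches is unknown up front, so report Long.MAX_VALUE.
        super(Long.MAX_VALUE, ORDERED | NONNULL | IMMUTABLE);
        this.m = m;
      }
    
      @Override public boolean tryAdvance(Consumer<? super String[]> action) {
        if (!m.find()) return false;
        // Index 0 holds the full match; indices 1..groupCount() hold the capture groups.
        final String[] groups = new String[m.groupCount()+1];
        for (int i = 0; i <= m.groupCount(); i++) groups[i] = m.group(i);
        action.accept(groups);
        return true;
      }
    }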
    

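    Note that the spliterator provides all matcher groups, not just the full match: index 0 is the full match and indices 1 and up are the capture groups. Also note that this spliterator can be used in a parallel stream, because AbstractSpliterator implements a splitting policy (it batches elements into arrays). The calls to find() themselves still happen sequentially.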

    Typically you will use a convenience stream factory:

    public static Stream<String[]> matcherStream(Matcher m) {
      return StreamSupport.stream(new MatcherSpliterator(m), false);
    }
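
    To see what each String[] element holds, here is a minimal self-contained sketch that bundles the spliterator and the factory with their generic type parameters written out. The class name MatcherStreamDemo, the helper keyValuePairs, and the key=value pattern are illustrative assumptions, not part of the original code.

```java
import java.util.List;
import java.util.Spliterators.AbstractSpliterator;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class MatcherStreamDemo {

  // Yields one String[] per match: [fullMatch, group1, group2, ...].
  static final class MatcherSpliterator extends AbstractSpliterator<String[]> {
    private final Matcher m;

    MatcherSpliterator(Matcher m) {
      super(Long.MAX_VALUE, ORDERED | NONNULL | IMMUTABLE);
      this.m = m;
    }

    @Override
    public boolean tryAdvance(Consumer<? super String[]> action) {
      if (!m.find()) return false;
      String[] groups = new String[m.groupCount() + 1];
      for (int i = 0; i <= m.groupCount(); i++) groups[i] = m.group(i);
      action.accept(groups);
      return true;
    }
  }

  static Stream<String[]> matcherStream(Matcher m) {
    return StreamSupport.stream(new MatcherSpliterator(m), false);
  }

  // Hypothetical pattern, chosen only for illustration: word=digits.
  private static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\d+)");

  // Hypothetical helper: renders each match as "key->value" using groups 1 and 2.
  static List<String> keyValuePairs(String input) {
    return matcherStream(KEY_VALUE.matcher(input))
        .map(gs -> gs[1] + "->" + gs[2])
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(keyValuePairs("a=1, b=22")); // prints [a->1, b->22]
  }
}
```

    Here gs[0] would be the full match ("a=1"), while gs[1] and gs[2] are the two capture groups.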
    

    This gives you a powerful basis to concisely write all kinds of complex regex-oriented logic, for example:

    private static final Pattern emailRegex = Pattern.compile("([^,]+?)@([^,]+)");
    public static void main(String[] args) {
      final String emails = "kid@gmail.com, stray@yahoo.com, miks@tijuana.com";
      System.out.println("User has e-mail accounts on these domains: " +
          matcherStream(emailRegex.matcher(emails))
          .map(gs->gs[2])            // group 2 is the domain part
          .collect(joining(", ")));  // static import of Collectors.joining
    }
    

    Which prints

    User has e-mail accounts on these domains: gmail.com, yahoo.com, tijuana.com
    

    For completeness, your code can be rewritten as follows (with toSet statically imported from Collectors):

    Set<String> set = matcherStream(mtr).map(gs->gs[0].toLowerCase()).collect(toSet());
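
    As a runnable sketch of that one-liner: the question's original pattern and input are not shown, so the word pattern, the sample text, and the names LowerCaseMatches and lowerCaseMatches below are assumptions made for illustration. The spliterator is written as an anonymous class only to keep the example short.

```java
import java.util.Set;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LowerCaseMatches {

  // Same idea as MatcherSpliterator, inlined as an anonymous subclass.
  static Stream<String[]> matcherStream(Matcher m) {
    return StreamSupport.stream(
        new Spliterators.AbstractSpliterator<String[]>(
            Long.MAX_VALUE,
            Spliterator.ORDERED | Spliterator.NONNULL | Spliterator.IMMUTABLE) {
          @Override
          public boolean tryAdvance(Consumer<? super String[]> action) {
            if (!m.find()) return false;
            String[] groups = new String[m.groupCount() + 1];
            for (int i = 0; i <= m.groupCount(); i++) groups[i] = m.group(i);
            action.accept(groups);
            return true;
          }
        },
        false);
  }

  // Collects every full match (group 0), lower-cased, into a Set.
  static Set<String> lowerCaseMatches(Pattern p, String input) {
    return matcherStream(p.matcher(input))
        .map(gs -> gs[0].toLowerCase())
        .collect(Collectors.toSet());
  }

  public static void main(String[] args) {
    // Hypothetical pattern and input: runs of ASCII letters.
    Pattern words = Pattern.compile("[A-Za-z]+");
    System.out.println(lowerCaseMatches(words, "Java java STREAM Stream regex"));
  }
}
```

    Collectors.toSet() makes no guarantee about the concrete Set type it returns. If a HashSet is specifically required, Collectors.toCollection(HashSet::new) can be used instead.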
    
