Partition a Stream by a discriminator function

Submitted by 依然范特西╮ on 2019-12-17 15:45:39

Question


One of the features missing from the Streams API is the "partition by" transformation, for example as defined in Clojure. Say I want to reproduce Hibernate's fetch join: I want to issue a single SQL SELECT statement and build objects like this from the result:

class Family {
   String surname;
   List<String> members;
}

I issue:

SELECT f.name, m.name 
FROM Family f JOIN Member m on m.family_id = f.id
ORDER BY f.name

and I retrieve a flat stream of (f.name, m.name) records. Now I need to transform it into a stream of Family objects, with a list of its members inside. Assume I already have a Stream<ResultRow>; now I need to transform it into a Stream<List<ResultRow>> and then act upon that with a mapping transformation which turns it into a Stream<Family>.

The semantics of the transformation are as follows: keep collecting the stream into a List for as long as the provided discriminator function keeps returning the same value; as soon as the value changes, emit the List as an element of the output stream and start collecting a new List.
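To make these semantics concrete, here is a minimal eager sketch that works on a List rather than a Stream. It is only a reference for the expected grouping, not the lazy solution being asked for. The class name PartitionBySemantics and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

class PartitionBySemantics {

    // Eager reference version: groups runs of consecutive elements
    // whose discriminator values are equal.
    static <E> List<List<E>> partitionBy(Function<? super E, ?> discriminator, List<E> input) {
        List<List<E>> result = new ArrayList<>();
        List<E> current = null;
        Object currentKey = null;
        for (E element : input) {
            Object key = discriminator.apply(element);
            // Start a new partition on the first element or when the key changes.
            if (current == null || !Objects.equals(key, currentKey)) {
                current = new ArrayList<>();
                result.add(current);
                currentKey = key;
            }
            current.add(element);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical "surname:forename" rows; the last Kray row is not adjacent to the first two.
        List<String> rows = Arrays.asList("Kray:Ronald", "Kray:Reginald", "Dors:Diana", "Kray:Charlie");
        System.out.println(partitionBy(r -> r.substring(0, r.indexOf(':')), rows));
        // prints [[Kray:Ronald, Kray:Reginald], [Dors:Diana], [Kray:Charlie]]
    }
}
```

Note that only adjacent elements are merged: a key that reappears later starts a new partition, which is why the query needs its ORDER BY clause.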

I hope to be able to write this kind of code (I already have the resultStream method):

Stream<ResultRow> dbStream = resultStream(queryBuilder.createQuery(
        "SELECT f.name, m.name"
      + " FROM Family f JOIN Member m on m.family_id = f.id"
      + " ORDER BY f.name"));
Stream<List<ResultRow>> partitioned = partitionBy(r -> r.string(0), dbStream);
Stream<Family> families = partitioned.map(rs -> {
                    Family f = new Family(rs.get(0).string(0));
                    f.members = rs.stream().map(r -> r.string(1)).collect(toList());
                    return f;
                 });

Needless to say, I expect the resulting stream to stay lazy (non-materialized) as I want to be able to process a result set of any size without hitting any O(n) memory limits. Without this crucial requirement I would be happy with the provided groupingBy collector.


Answer 1:


The solution requires us to define a custom Spliterator which can be used to construct the partitioned stream. We shall need to access the input stream through its own spliterator and wrap it into ours. The output stream is then constructed from our custom spliterator.

The following Spliterator will turn any Stream<E> into a Stream<List<E>> provided a Function<E, ?> as the discriminator function. Note that the input stream must be ordered for this operation to make sense.

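import static java.util.Comparator.naturalOrder;

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.Spliterator;
import java.util.Spliterators.AbstractSpliterator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class PartitionBySpliterator<E> extends AbstractSpliterator<List<E>> {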
  private final Spliterator<E> spliterator;
  private final Function<? super E, ?> partitionBy;
  private HoldingConsumer<E> holder;
  private Comparator<List<E>> comparator;

  public PartitionBySpliterator(Spliterator<E> toWrap, Function<? super E, ?> partitionBy) {
    super(Long.MAX_VALUE, toWrap.characteristics() & ~SIZED | NONNULL);
    this.spliterator = toWrap;
    this.partitionBy = partitionBy;
  }

  public static <E> Stream<List<E>> partitionBy(Function<E, ?> partitionBy, Stream<E> in) {
    return StreamSupport.stream(new PartitionBySpliterator<>(in.spliterator(), partitionBy), false);
  }

  @Override public boolean tryAdvance(Consumer<? super List<E>> action) {
    final HoldingConsumer<E> h;
    if (holder == null) {
      h = new HoldingConsumer<>();
      if (!spliterator.tryAdvance(h)) return false;
      holder = h;
    }
    else h = holder;
    final ArrayList<E> partition = new ArrayList<>();
    final Object partitionKey = partitionBy.apply(h.value);
    boolean didAdvance;
    do partition.add(h.value);
    while ((didAdvance = spliterator.tryAdvance(h))
        && Objects.equals(partitionBy.apply(h.value), partitionKey));
    if (!didAdvance) holder = null;
    action.accept(partition);
    return true;
  }

  static final class HoldingConsumer<T> implements Consumer<T> {
    T value;
    @Override public void accept(T value) { this.value = value; }
  }

  @Override public Comparator<? super List<E>> getComparator() {
    final Comparator<List<E>> c = this.comparator;
    return c != null? c : (this.comparator = comparator());
  }

  private Comparator<List<E>> comparator() {
    @SuppressWarnings({"unchecked","rawtypes"})
    final Comparator<? super E> innerComparator =
        Optional.ofNullable(spliterator.getComparator())
                .orElse((Comparator) naturalOrder());
    return (left, right) -> {
      final int c = innerComparator.compare(left.get(0), right.get(0));
      return c != 0? c : innerComparator.compare(
          left.get(left.size() - 1), right.get(right.size() - 1));
    };
  }
}
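As a way to check the laziness requirement, here is a simplified sketch that swaps the custom Spliterator for a plain Iterator wrapper. It is not the class above; LazyPartitionDemo and its partitionBy helper are hypothetical names. The assumption is that buffering one partition plus a single look-ahead element is acceptable, which the demo exercises by partitioning an infinite stream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LazyPartitionDemo {

    // Iterator-based variant: holds only the partition being built
    // and one look-ahead element that belongs to the next partition.
    static <E> Stream<List<E>> partitionBy(Function<? super E, ?> keyFn, Stream<E> in) {
        Iterator<E> source = in.iterator();
        Iterator<List<E>> partitions = new Iterator<List<E>>() {
            private E lookahead;
            private boolean hasLookahead;

            @Override public boolean hasNext() {
                return hasLookahead || source.hasNext();
            }

            @Override public List<E> next() {
                if (!hasNext()) throw new NoSuchElementException();
                E first = hasLookahead ? lookahead : source.next();
                hasLookahead = false;
                lookahead = null;
                Object key = keyFn.apply(first);
                List<E> partition = new ArrayList<>();
                partition.add(first);
                // Pull elements until the key changes or the source is exhausted.
                while (source.hasNext()) {
                    E candidate = source.next();
                    if (Objects.equals(keyFn.apply(candidate), key)) {
                        partition.add(candidate);
                    } else {
                        lookahead = candidate;
                        hasLookahead = true;
                        break;
                    }
                }
                return partition;
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(partitions, Spliterator.ORDERED | Spliterator.NONNULL),
                false);
    }

    public static void main(String[] args) {
        // Infinite source 0, 1, 2, ... keyed by n / 3; only the first three partitions are built.
        List<List<Integer>> firstThree = partitionBy(n -> n / 3, Stream.iterate(0, n -> n + 1))
                .limit(3)
                .collect(Collectors.toList());
        System.out.println(firstThree); // prints [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    }
}
```

Because the pipeline terminates on an unbounded source, memory use is bounded by the largest single partition rather than by the size of the input. Unlike the Spliterator version, this sketch does not propagate the SORTED characteristic or a comparator.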



Answer 2:


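For those of you who just want to group the elements of a stream and do not need the result to stay lazy, the built-in mappers and collectors are enough. Note that groupingBy collects the whole input into a Map before the families are produced.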

class Person {

    String surname;
    String forename;

    public Person(String surname, String forename) {
        this.surname = surname;
        this.forename = forename;
    }

    @Override
    public String toString() {
        return forename;
    }

}

class Family {

    String surname;
    List<Person> members;

    public Family(String surname, List<Person> members) {
        this.surname = surname;
        this.members = members;
    }

    @Override
    public String toString() {
        return "Family{" + "surname=" + surname + ", members=" + members + '}';
    }

}

private void test() {
    String[][] data = {
        {"Kray", "Ronald"},
        {"Kray", "Reginald"},
        {"Dors", "Diana"}
    };
    // Their families.
    Stream<Family> families = Arrays.stream(data)
            // Build people
            .map(a -> new Person(a[0], a[1]))
            // Collect into a Map<String,List<Person>> as families
            .collect(Collectors.groupingBy(p -> p.surname))
            // Convert them to families.
            .entrySet().stream()
            .map(p -> new Family(p.getKey(), p.getValue()));
    families.forEach(f -> System.out.println(f));
}
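One caveat with the snippet above: the single-argument groupingBy makes no guarantee about the iteration order of the resulting Map, so the families may not come out in the order the rows arrived. A possible variation, sketched below under the assumption that rows arrive as {surname, forename} arrays as in the test data, passes LinkedHashMap::new as the map factory to keep first-seen key order. The class and method names are hypothetical, and this still materializes all rows in memory.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedGroupingDemo {

    // Groups forenames by surname, keeping surnames in the order first encountered.
    static Map<String, List<String>> membersBySurname(String[][] rows) {
        return Arrays.stream(rows)
                .collect(Collectors.groupingBy(
                        row -> row[0],                 // key: surname
                        LinkedHashMap::new,            // insertion-ordered map
                        Collectors.mapping(row -> row[1], Collectors.toList())));
    }

    public static void main(String[] args) {
        String[][] data = {
            {"Kray", "Ronald"},
            {"Kray", "Reginald"},
            {"Dors", "Diana"}
        };
        System.out.println(membersBySurname(data));
        // prints {Kray=[Ronald, Reginald], Dors=[Diana]}
    }
}
```

Unlike a "partition by" over adjacent rows, this merges all rows with the same key no matter where they occur in the input.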



Answer 3:


This can be done with the collapse operation from the StreamEx library:

StreamEx.of(queryBuilder.createQuery(
    "SELECT f.name, m.name"
    + " FROM Family f JOIN Member m on m.family_id = f.id"
    + " ORDER BY f.name"))
        .collapse((a, b) -> a.string(0).equals(b.string(0)), Collectors.toList())
        .map(l -> new Family(l.get(0).string(0), StreamEx.of(l).map(r -> r.string(1)).toList())) 
        .forEach(System.out::println);


Source: https://stackoverflow.com/questions/28363323/partition-a-stream-by-a-discriminator-function
