Question
I am trying to build a one-producer, multiple-consumers model with streams in Java 8. I am reading data from a DB resource and want to process it in a streaming fashion (I cannot read the whole resource into memory).
Reading the source has to be single-threaded (the cursor is not thread-safe) and is fast. Processing each chunk of data is a heavy operation and can run in parallel.
I haven't found out how to join (interconnect) a non-parallel stream with parallel stream processing. Is there any way to do this with the Java 8 Stream API?
Example code:
This iterator has to run in a single thread because the cursor is not thread-safe.
class SimpleIterator implements Iterator<Data> {
    private final Cursor cursor;

    public SimpleIterator(Cursor cursor) {
        this.cursor = cursor;
    }

    @Override
    public boolean hasNext() {
        return cursor.hasNext();
    }

    @Override
    public Data next() {
        return cursor.next();
    }
}

// create the non-parallel stream
SimpleIterator iterator = new SimpleIterator(queryCursor);
Iterable<Data> iterable = () -> iterator;
Stream<Data> resultStream = StreamSupport.stream(iterable.spliterator(), false); // parallel set to false

// processing of each data item should run in parallel
resultStream.parallel().forEach(data -> processData(data));

public void processData(Data data) {
    // heavy operation
}
But if I make the stream parallel before calling forEach, the whole stream becomes parallel and the iterator is also called from multiple threads. Is there any way to interconnect these two streams in Java 8, or do I have to create a queue that feeds data from the single-threaded producer stream to the parallel stream?
Answer 1:
I am working on a problem where I need to do a full outer join on two streams, and the two problems appear to be similar. What I do is insert two blocking queues to buffer my input. I think you can do something similar with one blocking queue to split a single stream into multiple streams without parallelizing the source stream.
The solution I propose can be found below. I have not yet tested it for joining two streams, so I am not certain it works. The AbstractSpliterator class provides an implementation of trySplit, and the comments on trySplit are informative. The final method of the class constructs a parallelizable stream from the spliterator implementation.
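To see what the inherited trySplit does in isolation, here is a small sketch. The class TrySplitDemo, the Counting spliterator and the splitOnce helper are hypothetical names made up for illustration. The spliterator only implements tryAdvance; trySplit comes from AbstractSpliterator, which pulls a batch of elements through tryAdvance on the calling thread and hands them back as a separate spliterator. The batch size is an implementation detail, so the sketch does not rely on a specific value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;

public class TrySplitDemo {

    // Hypothetical spliterator that emits 0..limit-1 and only implements tryAdvance.
    static class Counting extends Spliterators.AbstractSpliterator<Integer> {
        private final int limit;
        private int next = 0;

        Counting(int limit) {
            // Size reported as unknown, as for a cursor whose length is not known up front.
            super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
            this.limit = limit;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Integer> action) {
            if (next >= limit) {
                return false;
            }
            action.accept(next++);
            return true;
        }
    }

    // Splits once and returns {size of the split-off prefix, size of what is left in the source}.
    public static long[] splitOnce(int limit) {
        Counting source = new Counting(limit);
        // Inherited from AbstractSpliterator: fills a batch by calling tryAdvance.
        Spliterator<Integer> prefix = source.trySplit();
        List<Integer> first = new ArrayList<>();
        if (prefix != null) {
            prefix.forEachRemaining(first::add);
        }
        List<Integer> rest = new ArrayList<>();
        source.forEachRemaining(rest::add);
        return new long[] { first.size(), rest.size() };
    }

    public static void main(String[] args) {
        long[] sizes = splitOnce(5000);
        System.out.println("prefix=" + sizes[0] + ", remainder=" + sizes[1]);
    }
}
```

The point for this problem is that every element still passes through tryAdvance one at a time, so the underlying source is never read concurrently; only the split-off batches are processed on other threads.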
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class StreamSplitter<T> extends Spliterators.AbstractSpliterator<T> {
    // End-of-stream marker. A BlockingQueue rejects null, so use a unique
    // sentinel object that is compared by identity and never passed downstream.
    private static final Object EOS = new Object();

    private final BlockingQueue<Object> queue;
    private final Thread thread;
    // Set once EOS has been taken, so later calls do not block on an empty queue.
    private boolean finished = false;

    // An implementation of Runnable that fills a queue from a stream
    private class Filler implements Runnable {
        private final Stream<T> stream;
        private final BlockingQueue<Object> queue;

        private Filler(Stream<T> stream, BlockingQueue<Object> queue) {
            this.stream = stream;
            this.queue = queue;
        }

        @Override
        public void run() {
            // The source stream must not contain null elements.
            stream.forEach(x -> {
                try {
                    // Blocks if the queue is full
                    queue.put(x);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            });
            // Stream is drained; put the end-of-stream marker.
            try {
                queue.put(EOS);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    private StreamSplitter(long estSize, int characteristics, Stream<T> srcStream) {
        super(estSize, characteristics);
        queue = new ArrayBlockingQueue<>(1024);
        // Fill the queue from a separate thread (may want to externalize this).
        thread = new Thread(new Filler(srcStream, queue));
        thread.start();
    }

    @Override
    @SuppressWarnings("unchecked")
    public boolean tryAdvance(Consumer<? super T> action) {
        // trySplit and forEachRemaining may call again after the end was reached.
        if (finished) {
            return false;
        }
        try {
            Object value = queue.take(); // waits (blocks) for entries in queue
            // If the end-of-stream marker is found, return false to signal
            // that the stream is finished.
            if (value == EOS) {
                finished = true;
                return false;
            }
            // Accept the next value.
            action.accept((T) value);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            finished = true;
            return false;
        }
        return true;
    }

    public static <T> Stream<T> splitStream(long estSize, int characteristics, Stream<T> srcStream) {
        Spliterator<T> spliterator = new StreamSplitter<T>(estSize, characteristics, srcStream);
        return StreamSupport.stream(spliterator, true);
    }
}
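For comparison, the plain-queue alternative that the question mentions can be sketched without a custom spliterator. This is a minimal sketch under stated assumptions, not a tested replacement for the class above: QueueFanOut, process and sumInParallel are hypothetical names, the queue capacity of 64 is arbitrary, the source is assumed to contain no null elements, and the processor is assumed not to throw.

```java
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import java.util.stream.IntStream;

public class QueueFanOut {
    // Unique end-of-stream marker; compared by identity, never handed to the processor.
    private static final Object EOS = new Object();

    // Reads `source` on the calling thread only and processes items on `workers` threads.
    @SuppressWarnings("unchecked")
    public static <T> void process(Iterator<T> source, int workers, Consumer<? super T> processor) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(64); // bounded, so a fast reader is throttled
        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    for (Object item = queue.take(); item != EOS; item = queue.take()) {
                        processor.accept((T) item); // the heavy operation runs here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }
        try {
            while (source.hasNext()) {
                queue.put(source.next()); // blocks while the queue is full
            }
            for (int i = 0; i < workers; i++) {
                queue.put(EOS); // one marker per worker so each of them stops
            }
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            for (Thread t : pool) {
                t.interrupt(); // release workers that are still waiting on the queue
            }
            Thread.currentThread().interrupt();
        }
    }

    // Hypothetical usage: sums 1..n, with the additions performed on the worker threads.
    public static long sumInParallel(int n, int workers) {
        AtomicLong sum = new AtomicLong();
        Iterator<Integer> source = IntStream.rangeClosed(1, n).boxed().iterator();
        process(source, workers, value -> sum.addAndGet(value));
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println(sumInParallel(1000, 4));
    }
}
```

In the question's setting, the iterator wrapping the cursor would take the place of `source` and processData would be the processor. The cursor is then touched only by the thread that calls process, while the heavy work is spread over the worker threads.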
Source: https://stackoverflow.com/questions/36791761/how-to-interconnect-non-parallel-stream-with-parallel-streamone-producer-multip