How to interleave streams (with backpressure)

悲哀的现实 2021-01-02 03:18

Suppose I have two possibly infinite streams:

s1 = a..b..c..d..e...
s2 = 1.2.3.4.5.6.7...

I want to merge the streams and then map

2 answers
  •  慢半拍i (original poster)
    2021-01-02 04:01

    The core challenge here was to understand how to formalise fairness. In the question I already mentioned the worker analogy. It turned out that the obvious fairness criterion is to pick the stream that has generated fewer events than the others or, taken even further, the stream whose generated streams have been waited on for less time.
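
    As a minimal sketch of that criterion in plain JavaScript (no Bacon.js), assume each stream has an accumulated cost and a queue of pending events. The name pickFairest is hypothetical and does not come from the gist; ties go to the lowest stream index.

```javascript
// Hypothetical helper: choose the index of the stream to serve next.
// queues[i] holds the pending events of stream i; costs[i] is the cost
// (event count or elapsed time) already charged to stream i.
function pickFairest(queues, costs) {
  var best = -1;
  for (var i = 0; i < queues.length; i++) {
    if (queues[i].length === 0) continue; // nothing to serve from this stream
    if (best === -1 || costs[i] < costs[best]) best = i;
  }
  return best; // -1 when every queue is empty
}
```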

    After that it was quite trivial to formalise the desired output using denotational semantics: code is on GitHub

    I didn't have time to develop the denotational combinators to include withStateMachine from Bacon.js, so the next step was to reimplement it in JavaScript with Bacon.js directly. The whole runnable solution is available as a gist.

    The idea is to make a state machine with

    • per stream costs and queues as a state
    • streams and additional feedback stream as inputs

    As the output of the whole system is fed back, we can dequeue the next event when the previous flatMapped stream has ended.

    For that I had to write a slightly ugly rec combinator:

    function rec(f) {
      var bus = new Bacon.Bus();
      var result = f(bus);
      bus.plug(result);
      return result;
    }
    

    Its type is (EventStream a -> EventStream a) -> EventStream a. The type resembles other recursion combinators, e.g. fix.

    It could be given better system-wide behaviour, since Bus breaks unsubscription propagation. That still needs work.

    The second helper function is stateMachine, which takes an array of streams and turns them into a single state machine. Essentially it's .withStateMachine ∘ mergeAll ∘ zipWithIndex.

    function stateMachine(inputs, initState, f) {
      var mapped = inputs.map(function (input, i) {
        return input.map(function (x) {
          return [i, x];
        });
      });
      return Bacon.mergeAll(mapped).withStateMachine(initState, function (state, p) {
        if (p.hasValue()) {
          p = p.value();
          return f(state, p[0], p[1]);
        } else {
          return [state, p];
        }
      });
    }
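
    To illustrate the contract of the transition function f without Bacon.js, here is a synchronous analogue over plain arrays. runStateMachine is a hypothetical name; the only thing carried over from the code above is that f receives (state, streamIndex, value) and returns a [newState, outputs] pair. In the real version the outputs are Bacon events such as new Bacon.Next(...), not plain values.

```javascript
// Hypothetical synchronous analogue of stateMachine: events are
// [streamIndex, value] pairs, already merged into one array.
function runStateMachine(taggedEvents, initState, f) {
  var state = initState;
  var out = [];
  taggedEvents.forEach(function (p) {
    var res = f(state, p[0], p[1]); // res is [newState, outputs]
    state = res[0];
    out = out.concat(res[1]);
  });
  return out;
}

// Example: emit each value together with a running per-stream count.
var counts = runStateMachine(
  [[0, "a"], [1, 1], [0, "b"]],
  [0, 0],
  function (state, i, x) {
    var next = state.slice(); // copy, so the old state is not mutated
    next[i] += 1;
    return [next, [x + ":" + next[i]]];
  }
);
// counts is ["a:1", "1:1", "b:2"]
```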
    

    Using these two helpers we can write a not-so-complex fair scheduler:

    function fairScheduler(streams, fn) {
      var streamsCount = streams.length;
      return rec(function (res) {
        return stateMachine(append(streams, res), initialFairState(streamsCount), function (state, i, x) {
          // console.log("FAIR: " + JSON.stringify(state), i, x);
    
          // END event
          if (i === streamsCount && x.end) {
            var additionalCost = new Date().getTime() - x.started;
    
            // add cost to input stream cost center
            var updatedState = _.extend({}, state, {
              costs: updateArray(
                state.costs,
                x.idx, function (cost) { return cost + additionalCost; }),
            });
    
            if (state.queues.every(function (q) { return q.length === 0; })) {
              // if queues are empty, set running: false and don't emit any events
              return [_.extend({}, updatedState, { running: false }), []];
            } else {
              // otherwise pick a stream with
              // - non-empty queue
              // - minimal cost
              var minQueueIdx = _.chain(state.queues)
                .map(function (q, i) {
                  return [q, i];
                })
                .filter(function (p) {
                  return p[0].length !== 0;
                })
                .sortBy(function (p) {
                  return state.costs[p[1]];
                })
                .value()[0][1];
    
              // emit an event from that stream
              return [
                _.extend({}, updatedState, {
                  queues: updateArray(state.queues, minQueueIdx, function (q) { return q.slice(1); }),
                  running: true,
                }),
                [new Bacon.Next({
                  value: state.queues[minQueueIdx][0],
                  idx: minQueueIdx,
                })],
              ];
            }
          } else if (i < streamsCount) {
            // event from input stream
            if (state.running) {
              // if worker is running, just enqueue the event
              return [
                _.extend({}, state, {
                  queues: updateArray(state.queues, i, function (q) { return q.concat([x]); }),
                }),
                [],
              ];
            } else {
              // if worker isn't running, start it right away
              return [
                _.extend({}, state, {
                  running: true,
                }),
                [new Bacon.Next({ value: x, idx: i})],
              ];
            }
          } else {
            return [state, []];
          }
    
        })
        .flatMapConcat(function (x) {
          // map the passed-through events,
          // and append special "end" event
          return fn(x).concat(Bacon.once({
            end: true,
            idx: x.idx,
            started: new Date().getTime(),
          }));
        });
      })
      .filter(function (x) {
        // filter out END events
        return !x.end;
      })
      .map(".value"); // and return only value field
    }
    

    The rest of the code in the gist is quite straightforward.
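
    The helpers that fairScheduler calls but that are not shown here (append, updateArray, initialFairState) live in the gist. The following is only a plausible sketch, inferred from how they are called above and not copied from the gist; the bodies are assumptions.

```javascript
// Assumed: return a new array with x appended (no mutation).
function append(arr, x) {
  return arr.concat([x]);
}

// Assumed: return a copy of arr with element idx replaced by f(arr[idx]).
function updateArray(arr, idx, f) {
  return arr.map(function (el, i) {
    return i === idx ? f(el) : el;
  });
}

// Assumed: zero cost and an empty queue per input stream, worker idle.
function initialFairState(n) {
  var costs = [];
  var queues = [];
  for (var i = 0; i < n; i++) {
    costs.push(0);
    queues.push([]);
  }
  return { costs: costs, queues: queues, running: false };
}
```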
