Is Reactive Extensions evaluating too many times?

Question

How many times is Reactive Extensions supposed to evaluate its various operators?

I have the following test code:

var seconds = Observable
    .Interval(TimeSpan.FromSeconds(5))
    .Do(_ => Console.WriteLine("{0} Generated Data", DateTime.Now.ToLongTimeString()));

var split = seconds
    .Do(_ => Console.WriteLine("{0}  Split/Branch received Data", DateTime.Now.ToLongTimeString()));

var merged = seconds
    .Merge(split)
    .Do(_ => Console.WriteLine("{0}   Received Merged data", DateTime.Now.ToLongTimeString()));

var pipeline = merged.Subscribe();

I expect this to write "Generated Data" once every five seconds. It then hands that data to both the 'split' stream, which writes "Split/Branch received Data", and the 'merged' stream, which writes "Received Merged data". Finally, because the 'merged' stream also receives from the 'split' stream, it gets the data a second time and writes "Received Merged data" again. (The order in which some of these lines are written is not particularly relevant.)

But the output I am getting is this:

8:29:56 AM Generated Data
8:29:56 AM Generated Data
8:29:56 AM  Split/Branch received Data
8:29:56 AM   Received Merged data
8:29:56 AM   Received Merged data
8:30:01 AM Generated Data
8:30:01 AM Generated Data
8:30:01 AM  Split/Branch received Data
8:30:01 AM   Received Merged data
8:30:01 AM   Received Merged data

It is writing "Generated Data" twice. To my understanding, the number of downstream observers subscribed to the "seconds" IObservable should not affect the number of times that "Generated Data" is written (which should be ONCE), but it does. Why?

NOTE: I am using the stable release v1.0 SP1 of Reactive Extensions in a .NET Framework 3.5 environment.

Answer 1:

Presumably, they chose that approach so that each subscriber gets its values at the same interval, measured from its own initial subscription. Consider how your alternate Interval would work:

0s - First subscriber subscribes
5s - Value: 0
8s - Second subscriber subscribes
10s - Value: 1
15s - Value: 2
17s - Unsubscribe both

What you end up with is something like this:

First  -----0----1----2-|
Second         --1----2-|

In this case, the two observers have noticeably different results depending on whether or not there is any other observer already attached. As it is implemented, Interval gives the same experience to each subscriber regardless of order or past subscribers.

All that said, you can "convert" Interval to the behavior you describe by adding .Publish().RefCount() when creating the seconds observable.
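As a rough sketch of that suggestion, the question's pipeline could be rebuilt on top of a shared source like this. The Build helper, the SharedIntervalDemo class and the log callback are names made up for illustration, and the using directives assume the System.Reactive package; Publish and RefCount are the operators mentioned above.

```csharp
using System;
using System.Reactive.Linq;

static class SharedIntervalDemo
{
    // Hypothetical helper: rebuilds the question's pipeline on one shared subscription.
    public static IObservable<long> Build(IObservable<long> source, Action<string> log)
    {
        var seconds = source
            .Do(_ => log("Generated Data"))
            .Publish()    // multicast through a single underlying subscription
            .RefCount();  // connect on the first subscriber, disconnect after the last

        var split = seconds.Do(_ => log("Split/Branch received Data"));

        return seconds
            .Merge(split)
            .Do(_ => log("Received Merged data"));
    }

    static void Main()
    {
        var merged = Build(Observable.Interval(TimeSpan.FromSeconds(5)), Console.WriteLine);
        using (merged.Subscribe())
        {
            Console.ReadLine(); // keep the process alive while ticks arrive
        }
    }
}
```

With the source shared this way, both branches hang off the same subscription to Interval, so "Generated Data" should be logged once per tick while "Received Merged data" is still logged twice.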

Answer 2:

It might sometimes seem nice if sequences were multicast at every step, but if they were, Rx could not offer the rich composition it does.

To think of it another way, IObservable is the push-based dual of IEnumerable. IEnumerable has the property of lazy evaluation - the values aren't computed until you start moving through the Enumerator. Rx sequences are composed lazily and finally a Subscribe() (the Observable equivalent of For-Each) realises the sequence.

This way you can stop the pipeline at all stages simply by un-subscribing from the last stage, allowing you to have fire and forget behavior without going through the nightmare of managing individual subscriptions.
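The same effect is easy to see on the IEnumerable side. The following is only a small sketch (LazyEnumerableDemo, Numbers and CountRuns are invented names): a lazily evaluated sequence that appears twice in a composition runs its body once per enumeration, just as the Do side effect in the question runs once per subscription.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyEnumerableDemo
{
    static int runs; // how many times the iterator body has started

    static IEnumerable<int> Numbers()
    {
        runs++;          // side effect, comparable to the Do(...) in the question
        yield return 1;
        yield return 2;
    }

    // Composes xs with itself and reports how often the source body ran.
    public static int CountRuns()
    {
        runs = 0;
        var xs = Numbers();                    // nothing has executed yet
        var doubled = xs.Select(x => x * 2);   // still nothing executed
        var both = xs.Concat(doubled);         // xs appears twice in the composition

        foreach (var x in both) { }            // enumerating walks xs twice

        return runs;
    }

    static void Main()
    {
        Console.WriteLine(CountRuns()); // prints 2
    }
}
```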

Answer 3:

On a related note, here's a brainteaser for you, illustrating Asti's analogy with lazily evaluated enumerable sequences:

private static Random s_rand = new Random();

public static IEnumerable<int> Rand()
{
    while (true)
        yield return s_rand.Next();
}

public static void Main()
{
    var xs = Rand();

    var res = xs.Zip(xs, (l, r) => l == r).All(b => b);

    Console.WriteLine(res);
}

If you Zip a random sequence with itself, do you expect all pairs of elements to be the same (i.e. causing the code above to run forever)? Or do you expect the code to terminate and print false for some reason?

(Creating the analogous observable code is left as an exercise for the reader.)

Answer 4:

From an object-oriented perspective, it's normal for thinking about streams to be grounded in the interfaces that define Observables/Enumerables. If you can ignore the fact that there's a convenience method called Reset defined on the enumerator, enumerables are, functionally speaking, f -> g -> value?. An enumerable is essentially a function you call to get the enumerator, which is essentially a function you keep calling until there are no more values to be returned.

Similarly, an Observable can simply be defined as f(g) -> g(h) -> h(value?) - it's a function you supply with the function you want called when there is a value.

This is why it makes no sense to describe an enumerable or observable as anything but a set of functions defined in a certain way so they can be composed - the contracts are there to ensure the ability to compose computations.

Whether they're live, cached or lazy are implementation details which may be abstracted elsewhere - and while I certainly don't disagree that these details are important it's more important to focus on the functional nature of it.

A sequence which is a database query or a directory listing has the same IEnumerable interface as a pre-computed set of values (like an Array). It's up to the code which finally consumes the sequence to make that distinction. If you can get used to the notion that it's a way to compose higher order functions, you'll find it easier to model a problem using Rx or Ix.
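To make the "just functions" view a bit more concrete, here is one possible encoding using plain delegates. It is a loose sketch rather than how Rx or LINQ are implemented: FunctionalShapes, PullRange and PushRange are invented names, and null is used (by assumption) to stand for "no more values".

```csharp
using System;

static class FunctionalShapes
{
    // Pull: calling the outer function yields an "enumerator", a function that
    // returns the next value, or null once the sequence is exhausted.
    public static Func<Func<int?>> PullRange(int start, int count)
    {
        return () =>
        {
            int i = 0; // fresh state for every enumerator
            return () => i < count ? start + i++ : (int?)null;
        };
    }

    // Push: calling the function with a callback "subscribes" that callback; it is
    // invoked once per value, then once with null to signal completion.
    public static Action<Action<int?>> PushRange(int start, int count)
    {
        return observer =>
        {
            for (int i = 0; i < count; i++) observer(start + i);
            observer(null);
        };
    }

    static void Main()
    {
        Func<int?> next = PullRange(10, 3)();
        for (int? v = next(); v != null; v = next())
            Console.WriteLine(v); // 10, 11, 12

        PushRange(10, 3)(v => Console.WriteLine(v.HasValue ? v.ToString() : "done"));
    }
}
```

Note that each call to PullRange(...)() or PushRange(...)(callback) starts from scratch, which mirrors why every subscriber in the question triggers its own "Generated Data" line.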

Source: https://stackoverflow.com/questions/12319204/is-reactive-extensions-evaluating-too-many-times

Tags

system.reactive