How do I share an observable with publish and connect?

-上瘾入骨i 2020-12-20 19:29

I have an observable data stream that I am applying operations to, splitting into two separate streams, applying more (distinct) operations to each of the two streams, and m

2 Answers
  • 2020-12-20 19:49

    You have published the wrong observable.

    With the current code you merge first and publish afterwards: Observable.Merge(a, b).Publish(). Because a and b are both defined against expensive, you still get two subscriptions to expensive.

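    The subscriptions create two pipelines, one through a and one through b, and each ends in its own subscription to expensive (the original pipeline diagram is not available).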

    You can see this if you take the .Publish() call out of your code. The output becomes:

    Doing an expensive operation
    Doing an expensive operation
    Doing an expensive operation
    Doing an expensive operation
    Subscriber A got: { Source = A, Value = #0 }
    Doing an expensive operation
    Doing an expensive operation
    Doing an expensive operation
    Doing an expensive operation
    Subscriber B got: { Source = B, Value = #1 }
    

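    Without Publish, each of the two subscribers builds its own merged pipeline, so there are four separate subscriptions to expensive (the original pipeline diagram is not available).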

    So, by shifting the .Publish() back up to expensive you eliminate the problem. That's where you really needed it because it is the expensive operation after all.
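
    The effect of where Publish sits can be checked by counting subscriptions. The following is a minimal sketch, separate from the original program: it assumes the System.Reactive package, and the class name, the method name and the counting source built with Observable.Defer are hypothetical stand-ins for expensive.

    ```csharp
    using System;
    using System.Reactive.Linq;

    public static class PublishPlacementDemo
    {
        // Returns how many times the cold source was subscribed to.
        public static int CountSourceSubscriptions(bool publishSource)
        {
            var subscriptions = 0;

            // Hypothetical stand-in for `expensive`: a cold source that counts its subscriptions.
            var source = Observable.Defer(() =>
            {
                subscriptions++;
                return Observable.Range(0, 4);
            });

            if (publishSource)
            {
                // Publish the source itself, then branch and merge downstream.
                var shared = source.Publish();
                var merged = Observable.Merge(
                    shared.Where(x => x % 2 == 0),
                    shared.Where(x => x % 2 != 0));
                merged.Subscribe(_ => { });
                shared.Connect();
            }
            else
            {
                // Branch first, then publish the merged result.
                var merged = Observable.Merge(
                    source.Where(x => x % 2 == 0),
                    source.Where(x => x % 2 != 0)).Publish();
                merged.Subscribe(_ => { });
                merged.Connect();
            }

            return subscriptions;
        }

        public static void Main()
        {
            Console.WriteLine(CountSourceSubscriptions(publishSource: false)); // 2
            Console.WriteLine(CountSourceSubscriptions(publishSource: true));  // 1
        }
    }
    ```

    Publishing after the merge still leaves one source subscription per branch, while publishing the source leaves exactly one.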

    This is the code you needed:

    var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) { IsBackground = false });
    var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
    var expensive = timer.Select(i =>
    {
        // Converting to strings is an expensive operation
        Console.WriteLine("Doing an expensive operation");
        return string.Format("#{0}", i);
    });
    
    var connectable = expensive.Publish();
    
    var a = connectable.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
    var b = connectable.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });
    
    var merged = Observable.Merge(a, b);
    
    merged.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
    merged.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));
    
    connectable.Connect();
    

    That nicely produces the following:

    Doing an expensive operation
    Subscriber A got: { Source = A, Value = #0 }
    Doing an expensive operation
    Subscriber B got: { Source = B, Value = #1 }
    Doing an expensive operation
    Subscriber A got: { Source = A, Value = #2 }
    Doing an expensive operation
    Subscriber B got: { Source = B, Value = #3 }
    

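    This gives you a single shared subscription to expensive, while the Where, Select and Merge stages downstream of it are still duplicated for each subscriber (the original pipeline diagram is not available). That duplication is fine because these parts aren't expensive.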

    The duplication is actually important. Shared parts of the pipelines make their endpoints vulnerable to errors and thus to early termination. The less sharing the better for the robustness of the code. It's only when you have an expensive operation that you should worry about publishing. Otherwise you should just let the pipelines be themselves.

    Here's an example to show it. Without a published source, an error in one pipeline does not pull down the others.

    But once you introduce a shared observable, a single error brings down all of the pipelines that depend on it.
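
    A minimal sketch of that difference, assuming the System.Reactive package. The flaky source below is hypothetical and contrived so that only its first subscription fails; it is not part of the original program.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Reactive.Linq;

    public static class SharedErrorDemo
    {
        // Returns what subscribers A and B observed, in order.
        public static List<string> Run(bool share)
        {
            var log = new List<string>();
            var attempts = 0;

            // Hypothetical flaky cold source: only its first subscription fails.
            var flaky = Observable.Defer(() =>
                ++attempts == 1
                    ? Observable.Throw<string>(new InvalidOperationException("first subscription failed"))
                    : Observable.Return("ok"));

            if (share)
            {
                // One subscription to the source feeds both subscribers.
                var shared = flaky.Publish();
                shared.Subscribe(v => log.Add("A got " + v), e => log.Add("A failed"));
                shared.Subscribe(v => log.Add("B got " + v), e => log.Add("B failed"));
                shared.Connect();
            }
            else
            {
                // Each subscriber gets its own independent subscription.
                flaky.Subscribe(v => log.Add("A got " + v), e => log.Add("A failed"));
                flaky.Subscribe(v => log.Add("B got " + v), e => log.Add("B failed"));
            }

            return log;
        }

        public static void Main()
        {
            Console.WriteLine(string.Join(", ", Run(share: false))); // A failed, B got ok
            Console.WriteLine(string.Join(", ", Run(share: true)));  // A failed, B failed
        }
    }
    ```

    In the unshared case the failure stays with subscriber A; in the shared case the one failing subscription terminates both A and B.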

  • 2020-12-20 19:56

    One possible fix:

    var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) { IsBackground = false });
    var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
    var expensive = timer.Select(i =>
    {
        // Converting to strings is an expensive operation
        Console.WriteLine("Doing an expensive operation");
        return string.Format("#{0}", i);
    });
    
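    // A ReplaySubject buffers every value, so later subscribers still see earlier items.
    var subj = new ReplaySubject<string>();
    // Subscribing the subject starts the timer; the expensive step now runs once per tick.
    expensive.Subscribe(subj);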
    
    var a = subj.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
    var b = subj.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });
    
    var merged = Observable.Merge(a, b);
    merged.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
    merged.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));
    

    The above example essentially creates a new intermediate observable that emits the results of the expensive operation. This allows you to subscribe to the results of the expensive operation, not to an expensive transformation applied to a timer.

    With this you'll see:

    Doing an expensive operation
    Subscriber A got: { Source = A, Value = #0 }
    Doing an expensive operation
    Subscriber B got: { Source = B, Value = #1 }
    

    (Output continues, truncated for brevity.)

    Alternatively, you could move the calls to Publish and Connect:

    var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) {IsBackground = false});
    var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
    var expensive = timer.Select(i =>
    {
        // Converting to strings is an expensive operation
        Console.WriteLine("Doing an expensive operation");
        return string.Format("#{0}", i);
    }).Publish();
    
    var a = expensive.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
    var b = expensive.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });
    
    var merged = Observable.Merge(a, b);
    merged.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
    merged.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));
    
    expensive.Connect();
    

    Why ReplaySubject, not just Subject or some other subject?

    In the .NET Rx implementation, a Subject is what the ReactiveX documentation calls a PublishSubject: it emits to an observer only those items that the source Observable emits after that observer subscribes. A ReplaySubject, on the other hand, emits to every observer all of the items that the source Observable has emitted, regardless of when the observer subscribes. If we used a plain Subject in the first example, subscribers to subj would miss anything emitted between the time subj subscribes to the expensive operation and the time they subscribe to subj.
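
    The difference is easy to see with a late subscriber. This is a small sketch assuming the System.Reactive package; the class and method names are hypothetical.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Reactive.Subjects;

    public static class LateSubscriberDemo
    {
        // Emits 1, subscribes, then emits 2; returns what the late subscriber saw.
        public static List<int> LateSubscriberSees(ISubject<int> subject)
        {
            var seen = new List<int>();
            subject.OnNext(1);                      // emitted before anyone subscribes
            subject.Subscribe(v => seen.Add(v));    // the late subscriber
            subject.OnNext(2);                      // emitted after the subscription
            return seen;
        }

        public static void Main()
        {
            var plain = LateSubscriberSees(new Subject<int>());
            var replay = LateSubscriberSees(new ReplaySubject<int>());

            Console.WriteLine(string.Join(", ", plain));  // 2
            Console.WriteLine(string.Join(", ", replay)); // 1, 2
        }
    }
    ```

    The plain Subject drops the value emitted before the subscription, while the ReplaySubject replays it.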
