Iterating an IAsyncEnumerable in a function returning an IAsyncEnumerable with cancellation

Posted by ◇◆丶佛笑我妖孽 on 2020-12-09 05:10:39

Question


As the title says, I have the following function:

public async IAsyncEnumerable<Job> GetByPipeline(int pipelineId,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    await foreach (var job in context.Jobs.Where(job => job.Pipeline.Id == pipelineId)
        .AsAsyncEnumerable()
        .WithCancellation(cancellationToken)
        .ConfigureAwait(false))
    {
        yield return job;
    }
}

I have trouble wrapping my head around where the cancellation token is going, and a nagging feeling that I am using it in too many places.

What is actually happening here when you deconstruct all the fancy async stuff? And are there any better ways to write this function?


Answer 1:


For starters, this method could be reduced to:

public IAsyncEnumerable<Job> GetByPipeline(int pipelineId)
{
    return context.Jobs
                  .Where(job => job.Pipeline.Id == pipelineId)
                  .AsAsyncEnumerable();
}

or even

public IAsyncEnumerable<Job> GetByPipeline(int pipelineId)
    => context.Jobs
              .Where(job => job.Pipeline.Id == pipelineId)
              .AsAsyncEnumerable();

The method doesn't do anything with job so it doesn't need to iterate over it.

Cancellation

What if the method actually used job, where should the cancellation token be used?

Let's clean up the method a bit. The equivalent is:

public async IAsyncEnumerable<Job> GetByPipeline(
      int pipelineId, 
      [EnumeratorCancellation] CancellationToken ct = default)
{
    // Just a query; nothing is executed yet
    var query = context.Jobs.Where(job => job.Pipeline.Id == pipelineId);

    // Returns the *results* as an async stream, as soon as they arrive
    var jobStream = query.AsAsyncEnumerable();

    // Process the results from the async stream as they arrive
    await foreach (var job in jobStream.WithCancellation(ct).ConfigureAwait(false))
    {
        // Does *that* need cancelling?
        DoSometingExpensive(job);

        // An async iterator must yield its results to the caller
        yield return job;
    }
}

The IQueryable query doesn't run anything, it represents the query. It doesn't need cancellation.

ToList() and similar methods execute the query and consume all the results. AsEnumerable() and AsAsyncEnumerable() do not run anything by themselves; they produce results only when the sequence is enumerated. Since neither the query definition nor the As...Enumerable() call does any work on its own, neither needs a cancellation token.

await foreach will iterate over the entire async stream so if we want to be able to abort it, we do need to pass the cancellation token.

Finally, does DoSometingExpensive(job); need cancellation? Is it so expensive that we want to be able to break out of it if it takes too long? Or can we wait until it's finished before exiting the loop? If it needs cancellation, it will need the CancellationToken too.
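
As a rough sketch of that last point, the expensive step can accept the same token and check it between units of work. Everything here is hypothetical and only for illustration: ProcessJob stands in for DoSometingExpensive, Jobs() stands in for the query results, and the loop body is placeholder work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ExpensiveStepDemo
{
    // Hypothetical stand-in for the query results: three integer "jobs".
    public static async IAsyncEnumerable<int> Jobs()
    {
        for (var i = 1; i <= 3; i++)
        {
            await Task.Yield(); // simulate results arriving asynchronously
            yield return i;
        }
    }

    // Hypothetical expensive step. It checks the token between units of work,
    // so a cancellation request is noticed in the middle of a job.
    public static long ProcessJob(int job, CancellationToken ct)
    {
        long sum = 0;
        for (var step = 0; step < 1000; step++)
        {
            ct.ThrowIfCancellationRequested();
            sum += (long)job * step; // placeholder work
        }
        return sum;
    }

    // The same token is used for the enumeration and for the per-item work.
    public static async Task<int> ProcessAsync(IAsyncEnumerable<int> jobs, CancellationToken ct)
    {
        var processed = 0;
        await foreach (var job in jobs.WithCancellation(ct).ConfigureAwait(false))
        {
            ProcessJob(job, ct);
            processed++;
        }
        return processed;
    }
}
```

With CancellationToken.None, ProcessAsync returns 3. With a token that is already cancelled, the first call to ProcessJob throws OperationCanceledException. If the step can safely run to completion, skip the extra parameter and rely on the check between iterations.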

ConfigureAwait

Finally, ConfigureAwait(false) isn't involved in cancellation, and may not be needed at all. Without it, execution returns to the original synchronization context after each await. In a desktop application, this means the UI thread. That's what allows us to modify the UI in an async event handler.

If GetByPipeline ran in a desktop app and wanted to modify the UI, it would have to remove ConfigureAwait(false):

await foreach (var job in jobStream.WithCancellation(ct))
{
    // Update the UI
    toolStripProgressBar.Increment(1);
    toolStripStatusLabel.Text = job.Name;
    // Do the actual job
    DoSometingExpensive(job);
}

With ConfigureAwait(false), execution continues on a thread pool thread, so we can't touch the UI there.

Library code shouldn't affect how execution resumes, so most libraries use ConfigureAwait(false) and leave the final decision to the UI developer.

If GetByPipeline is a library method, do use ConfigureAwait(false).
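
The effect on the synchronization context can be observed without a UI. The sketch below is only an illustration under stated assumptions: CountingContext is a hypothetical context that counts how many continuations are posted to it, standing in for a UI context, and a single Task.Delay stands in for one iteration of await foreach.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context that counts posted continuations (a stand-in for a UI context).
public sealed class CountingContext : SynchronizationContext
{
    private int _posts;
    public int Posts => Volatile.Read(ref _posts);

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref _posts);
        base.Post(d, state); // the base implementation queues the callback to the thread pool
    }
}

public static class ConfigureAwaitDemo
{
    // Installs the counting context only while the awaited operation is started,
    // then reports how many continuations were posted back to it.
    public static Task<int> CountPostsAsync(bool captureContext)
    {
        var previous = SynchronizationContext.Current;
        var ctx = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(ctx);
        try
        {
            // Core runs synchronously up to its first await, so the await sees ctx.
            return Core(ctx, captureContext);
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    private static async Task<int> Core(CountingContext ctx, bool captureContext)
    {
        // ConfigureAwait(true) behaves like a plain await.
        await Task.Delay(20).ConfigureAwait(captureContext);
        return ctx.Posts;
    }
}
```

CountPostsAsync(true) reports one posted continuation, while CountPostsAsync(false) reports none, because with ConfigureAwait(false) the continuation never goes through the captured context.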




Answer 2:


Imagine that somewhere deep inside Entity Framework there is a method GetJobs that retrieves the Job objects from the database:

private static async IAsyncEnumerable<Job> GetJobs(DbDataReader dataReader,
    [EnumeratorCancellation]CancellationToken cancellationToken = default)
{
    while (await dataReader.ReadAsync(cancellationToken))
    {
        yield return new Job()
        {
            Id = (int)dataReader["Id"],
            Data = (byte[])dataReader["Data"]
        };
    }
}

Now imagine that the Data property contains a huge byte array with data associated with the Job. Retrieving the array of each Job may take a non-trivial amount of time. In this case breaking the loop between iterations would not be enough, because there would be a noticeable delay between invoking the Cancel method and the raising of the OperationCanceledException. This is why the method DbDataReader.ReadAsync needs a CancellationToken, so that the query can be canceled instantly.

The challenge now is how to get the CancellationToken supplied by the client code to the GetJobs method when a property like context.Jobs sits in between. The solution is the WithCancellation extension method, which stores the token and passes it deeper, to a method accepting an argument decorated with the EnumeratorCancellation attribute.

So in your case you have done everything correctly. You have included a cancellationToken argument in your IAsyncEnumerable-returning method, which is the recommended practice. This way, a subsequent WithCancellation chained to your GetByPipeline method will not be wasted. Then you chained WithCancellation after AsAsyncEnumerable inside your method, which is also correct. Otherwise the CancellationToken would not reach its final destination, the GetJobs method.
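
The token flow described in this answer can be reproduced in a small, self-contained sketch. The names are hypothetical: Source plays the role of GetJobs (with Task.Delay standing in for DbDataReader.ReadAsync), and Wrapper plays the role of GetByPipeline.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class TokenFlowDemo
{
    // Hypothetical stand-in for GetJobs: an endless stream that observes the token.
    public static async IAsyncEnumerable<int> Source(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        for (var i = 0; ; i++)
        {
            // Task.Delay observes the token, as DbDataReader.ReadAsync would.
            await Task.Delay(10, ct);
            yield return i;
        }
    }

    // Hypothetical stand-in for GetByPipeline: receives the token and forwards it.
    public static async IAsyncEnumerable<int> Wrapper(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        await foreach (var item in Source().WithCancellation(ct).ConfigureAwait(false))
        {
            yield return item;
        }
    }

    // Client code: attaches the token with WithCancellation and cancels after three items.
    public static async Task<(int Count, bool Cancelled)> RunAsync()
    {
        using var cts = new CancellationTokenSource();
        var count = 0;
        try
        {
            await foreach (var item in Wrapper().WithCancellation(cts.Token))
            {
                if (++count == 3) cts.Cancel();
            }
        }
        catch (OperationCanceledException)
        {
            // Thrown by Task.Delay inside Source, two levels below the caller.
            return (count, true);
        }
        return (count, false);
    }
}
```

RunAsync returns (3, true): the token attached by the client reaches Source through both [EnumeratorCancellation] parameters. Without the WithCancellation(ct) call inside Wrapper, Source would never see the token and the loop in this sketch would not end.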



Source: https://stackoverflow.com/questions/58757843/iterating-an-iasyncenumerable-in-a-function-returning-an-iasyncenumerable-with-c
