Parallel.ForEach and HttpClient - strange behaviour

若如初见. Submitted on 2020-05-26 03:57:38

Question


I have a piece of code that loops over a collection and calls HttpClient on each iteration. The API that HttpClient calls takes 30-40 ms on average to execute. When I call it sequentially I get the expected result, but as soon as I use Parallel.ForEach, it takes longer. Looking closely at the logs, I can see that quite a few HttpClient calls take more than 1000 ms, after which the time drops back to 30-40 ms. In the API logs, the execution time barely goes over 100 ms. I am not sure why I get this spike.

The code is

using (var client = new HttpClient())
{
  var content = new StringContent(parameters, Encoding.UTF8, "application/json");
  var response = client.PostAsync(url, content);
  _log.Info(string.Format("Took {0} ms to send post", watch.ElapsedMilliseconds));
  watch.Restart();

  var responseString = response.Result.Content.ReadAsStringAsync();
  _log.Info(string.Format("Took {0} ms to readstring after post", watch.ElapsedMilliseconds));
}

The parallel call is something like this

    Console.WriteLine("starting parallel...");
    Parallel.ForEach(recipientCollections, recipientCollection => 
      {    
        // A lot of processing happens here to create relevant content
        var secondaryCountryRecipientList = string.Join(",",refinedCountryRecipients);
        var emailApiParams = new SendEmailParametersModel(CountrySubscriberApplicationId,
                                        queueItem.SitecoreId, queueItem.Version, queueItem.Language, countryFeedItem.Subject,
                                        countryFeedItem.Html, countryFeedItem.From, _recipientsFormatter.Format(secondaryCountryRecipientList));

        log.Info(string.Format("Sending email request for {0}. Recipients {1}", queueItem.SitecoreId, secondaryCountryRecipientList));

        var response = _notificationsApi.Invoke(emailApiParams);
        });

thanks


Answer 1:


By default, .NET allows only 2 concurrent connections per server. To change this, set ServicePointManager.DefaultConnectionLimit to a larger value, e.g. 20 or 100.

This won't prevent you from flooding the server or consuming too much memory if you make too many requests, though. A better option is to use an ActionBlock<T> to buffer the requests and send them in parallel in a controlled fashion, e.g.:

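 // Requires: System.Diagnostics, System.Net, System.Net.Http, System.Text,
 // System.Threading.Tasks.Dataflow
 ServicePointManager.DefaultConnectionLimit = 20;

 // One shared client for all requests
 var client = new HttpClient();

 // Process at most 10 messages concurrently
 var blockOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 };

 var emailBlock = new ActionBlock<SendEmailParametersModel>(async parameters =>
     {
         var watch = Stopwatch.StartNew();
         // StringContent needs a string, so this assumes parameters has been
         // serialized to JSON (the serializer is not shown in the original)
         var content = new StringContent(parameters, Encoding.UTF8, "application/json");
         var response = await client.PostAsync(url, content);
         _log.Info(...);
         watch.Restart();

         // response is already an HttpResponseMessage after the await, so no .Result
         var responseString = await response.Content.ReadAsStringAsync();
         _log.Info(...);
     }, blockOptions);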

Sending the emails doesn't require parallel invocation any more:

foreach(var recipientCollection in recipientCollections)
{
    var secondaryCountryRecipientList = string.Join(",",refinedCountryRecipients);
    var emailApiParams = new SendEmailParametersModel(CountrySubscriberApplicationId, queueItem.SitecoreId, queueItem.Version, queueItem.Language, countryFeedItem.Subject,countryFeedItem.Html, countryFeedItem.From, _recipientsFormatter.Format(secondaryCountryRecipientList));

   emailBlock.Post(emailApiParams);
   log.Info(...);
}
emailBlock.Complete();
await emailBlock.Completion;

HttpClient is thread-safe, which allows you to use the same client for all requests.

The code above buffers all requests and executes them 10 at a time. Calling Complete() tells the block to stop accepting new messages and to finish processing those already posted. Awaiting emailBlock.Completion (a Task property, not a method) waits for all buffered messages to finish before proceeding.
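
To try the throttling behaviour without a real server, here is a minimal sketch of the same ActionBlock pattern. The names ThrottledSendDemo, FakeSendAsync and RunAsync are hypothetical, and Task.Delay stands in for the HTTP call; it assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ThrottledSendDemo
{
    // Hypothetical stand-in for the real HTTP call: it only waits,
    // so the sketch runs without a server.
    private static Task FakeSendAsync(int item)
    {
        return Task.Delay(20);
    }

    // Posts itemCount items to an ActionBlock limited to maxParallel
    // concurrent handlers and returns the highest concurrency observed.
    public static async Task<int> RunAsync(int itemCount, int maxParallel)
    {
        var gate = new object();
        int inFlight = 0;
        int peak = 0;

        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxParallel
        };

        var block = new ActionBlock<int>(async item =>
        {
            lock (gate)
            {
                inFlight++;
                if (inFlight > peak) peak = inFlight;
            }

            await FakeSendAsync(item);

            lock (gate)
            {
                inFlight--;
            }
        }, options);

        for (int i = 0; i < itemCount; i++)
        {
            block.Post(i);        // buffered by the block; returns immediately
        }

        block.Complete();         // no further items will be accepted
        await block.Completion;   // completes once every buffered item is done
        return peak;
    }
}
```

Calling `await ThrottledSendDemo.RunAsync(50, 10)` returns the peak number of handlers that ran at once, which should never exceed the configured limit of 10.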




Answer 2:


You are overloading the server. Parallel.ForEach has no idea how many threads are optimal for your specific web service, so you will get erratic results. In fact, if the loop runs for a long time, the thread count can rise into the hundreds and even the thousands (really!). Determine the right degree of parallelism (DOP) empirically and set it explicitly.

When the service is overloaded it's not unusual to see very high servicing times. How else could it be? There's not enough capacity to do it quickly.
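
One way to fix the degree of parallelism for Parallel.ForEach is to pass a ParallelOptions instance. The sketch below is only an illustration: FixedDopDemo and Run are hypothetical names, and Thread.Sleep stands in for the blocking API call from the question.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class FixedDopDemo
{
    // Runs a blocking stand-in for the API call over all items with a fixed
    // degree of parallelism and returns the highest concurrency observed.
    public static int Run(IEnumerable<int> items, int dop)
    {
        var gate = new object();
        int inFlight = 0;
        int peak = 0;

        // Caps the number of concurrent loop bodies at dop.
        var options = new ParallelOptions { MaxDegreeOfParallelism = dop };

        Parallel.ForEach(items, options, item =>
        {
            lock (gate)
            {
                inFlight++;
                if (inFlight > peak) peak = inFlight;
            }

            Thread.Sleep(20); // hypothetical stand-in for the blocking API call

            lock (gate)
            {
                inFlight--;
            }
        });

        return peak;
    }
}
```

Running this with different values of dop against the real service, and comparing throughput and latency, is one way to determine the right value empirically.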

 var responseString = response.Result.Content.ReadAsStringAsync()

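Here, you are missing a .Result call: ReadAsStringAsync() returns a Task, so the second log line measures only how long it takes to start the read, not how long it takes to finish. The timing is therefore off, but this does not change the conclusion.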
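
The effect on the measurement can be illustrated with a small sketch. TimingDemo, SlowReadAsync and MeasureAsync are hypothetical names, and Task.Delay stands in for the time a real response body would take to arrive.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TimingDemo
{
    // Hypothetical stand-in for ReadAsStringAsync: completes after about 200 ms.
    private static async Task<string> SlowReadAsync()
    {
        await Task.Delay(200);
        return "body";
    }

    // Returns (ms elapsed when the task was merely started,
    //          ms elapsed when the task had actually completed).
    public static async Task<Tuple<long, long>> MeasureAsync()
    {
        var watch = Stopwatch.StartNew();

        // The call returns a pending Task as soon as its first await yields.
        Task<string> pending = SlowReadAsync();
        long started = watch.ElapsedMilliseconds;

        // Only after awaiting (or blocking on .Result) is the work finished.
        string body = await pending;
        long completed = watch.ElapsedMilliseconds;

        return Tuple.Create(started, completed);
    }
}
```

The first value is what the question's log line records; the second is the figure that reflects the real duration of the read.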

You might also be hitting the .NET limit on concurrent HTTP connections per server. The default is 2.
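
On runtimes where HttpClientHandler exposes MaxConnectionsPerServer (for example .NET Core), the limit can also be set per handler rather than globally. This is a sketch under that assumption; ConnectionLimitDemo and CreateHandler are hypothetical names, and the right value depends on what the server can take.

```csharp
using System.Net.Http;

public static class ConnectionLimitDemo
{
    // Builds a handler that allows up to maxConnections concurrent
    // connections to any single server.
    public static HttpClientHandler CreateHandler(int maxConnections)
    {
        return new HttpClientHandler
        {
            MaxConnectionsPerServer = maxConnections
        };
    }
}
```

A client created with `new HttpClient(ConnectionLimitDemo.CreateHandler(20))` can then be shared across all requests.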



Source: https://stackoverflow.com/questions/38221977/parallel-foreach-and-httpclient-strange-behaviour
