.NET's Multi-threading vs Multi-processing: Awful Parallel.ForEach Performance

旧巷少年郎 2020-12-13 02:34

I have coded a very simple "Word Count" program that reads a file and counts each word's occurrence in the file. Here is a part of the code:

class Alaki
{         


        
3 Answers
  •  心在旅途
    2020-12-13 03:30

    An attempt to explain the results:

    • a quick run in the VS profiler shows it's barely reaching 40% CPU utilization.
    • String.Split is the main hotspot.
    • so something shared must be blocking the CPU.
    • that something is most likely memory allocation. Your bottlenecks are:
    var dic = new Dictionary<string, List<int>>();
    ...
       dic[token].Add(1);
    

    I replaced this with

    var dic = new Dictionary<string, int>();
    ...
    ... else dic[token] += 1;
    

    and the result is closer to a 2x speedup.
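
    To make the difference concrete, here is a minimal sketch of the two counting styles side by side. The class and method names (CounterSketch, CountWithLists, CountWithInts) are hypothetical, and splitting on a single space is an assumption, since the question's tokenizing code is not shown. The list version allocates a List<int> per distinct token and grows it on every hit; the int version only updates a value in place.

```csharp
using System;
using System.Collections.Generic;

static class CounterSketch
{
    // Assumed tokenizer: split on spaces and drop empty entries.
    static string[] Tokens(string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Shape of the original code: one List<int> per distinct token,
    // with one element appended for every occurrence.
    public static Dictionary<string, List<int>> CountWithLists(string text)
    {
        var dic = new Dictionary<string, List<int>>();
        foreach (var token in Tokens(text))
        {
            if (!dic.TryGetValue(token, out var hits))
            {
                hits = new List<int>();   // extra heap allocation per token
                dic[token] = hits;
            }
            hits.Add(1);                  // list may reallocate as it grows
        }
        return dic;
    }

    // Shape of the replacement: a plain int counter per token.
    public static Dictionary<string, int> CountWithInts(string text)
    {
        var dic = new Dictionary<string, int>();
        foreach (var token in Tokens(text))
        {
            if (!dic.ContainsKey(token)) dic[token] = 1;
            else dic[token] += 1;
        }
        return dic;
    }
}
```

    With the list version, a token's count is dic[token].Count; with the int version it is dic[token] directly.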

    But my counter question would be: does it matter? Your code is very artificial and incomplete. The parallel version ends up creating multiple dictionaries without merging them. This is not even close to a real situation. And as you can see, little details do matter.
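
    For reference, one common way to get a single merged result is to give each worker its own dictionary and merge it under a lock when that worker finishes. This is only a sketch, not the question's code: the names ParallelWordCountSketch and Count are hypothetical, the input is assumed to be a sequence of lines, and tokens are assumed to be space-separated. It uses the Parallel.ForEach overload that takes localInit and localFinally delegates.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ParallelWordCountSketch
{
    public static Dictionary<string, int> Count(IEnumerable<string> lines)
    {
        var total = new Dictionary<string, int>();
        var gate = new object();   // guards 'total' during the merge step

        Parallel.ForEach<string, Dictionary<string, int>>(
            lines,
            // localInit: each worker task starts with its own private dictionary.
            () => new Dictionary<string, int>(),
            // body: count into the private dictionary, so no locking is needed here.
            (line, loopState, local) =>
            {
                foreach (var token in line.Split(new[] { ' ' },
                             StringSplitOptions.RemoveEmptyEntries))
                {
                    local.TryGetValue(token, out var n);   // n is 0 when the key is missing
                    local[token] = n + 1;
                }
                return local;
            },
            // localFinally: merge each private dictionary into the shared total.
            local =>
            {
                lock (gate)
                {
                    foreach (var pair in local)
                    {
                        total.TryGetValue(pair.Key, out var n);
                        total[pair.Key] = n + pair.Value;
                    }
                }
            });

        return total;
    }
}
```

    The lock is taken once per worker rather than once per token, so contention on the shared dictionary stays low; whether this beats the sequential version still depends on the input size and on allocation pressure.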

    Your sample code is too complex to make broad statements about Parallel.ForEach().
    It is too simple to solve/analyze a real problem.
