How to properly parallelise a job that relies heavily on I/O

耶瑟儿~ 2020-12-02 15:04

I'm building a console application that has to process a bunch of data.

Basically, the application grabs references from a DB. For each reference, parse the conten

5 answers
  •  感动是毒
    2020-12-02 15:19

    Your best bet in this kind of scenario is the producer-consumer model: one thread pulls the data and a pool of workers processes it. There's no easy way around the I/O, so you might as well focus on optimizing the computation itself.

    I will now try to sketch a model:

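    // shared state (assumed types; the original sketch did not declare them)
    var queue = new Queue<DataRow>();
    var itemAvailable = new AutoResetEvent(false);

    // producer thread
    var refs = GetReferencesFromDB(); // ~5000 DataRow returned

    foreach (var reference in refs) // "ref" is a reserved keyword in C#
    {
        lock (queue)
        {
            queue.Enqueue(reference);
            itemAvailable.Set(); // "event" is also reserved, hence the rename
        }

        // if the queue is limited, test if the queue is full and wait.
    }

    // consumer threads
    while (true)
    {
        DataRow value = null;
        lock (queue)
        {
            if (queue.Count > 0)
            {
                value = queue.Dequeue();
            }
        }

        if (value != null)
        {
            // process value
        }
        else
        {
            itemAvailable.WaitOne(); // block until the producer signals that an item was queued
        }
    }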
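
As an alternative to the hand-rolled lock and event above, the same producer/consumer shape can be sketched with `BlockingCollection<T>` from `System.Collections.Concurrent`, which blocks the producer when a bounded queue is full and lets the consumers finish cleanly once the producer is done. This is only a minimal sketch under assumptions: `Process` and `RunPipeline` are hypothetical names, plain integers stand in for the `DataRow` references, and the "work" is a trivial doubling so the result is easy to check.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the per-reference work (parsing, computing, ...).
static int Process(int reference) => reference * 2;

// Runs one producer and `workerCount` consumers over a bounded queue
// and returns the sum of the processed results.
static long RunPipeline(IEnumerable<int> references, int workerCount, int capacity)
{
    long total = 0;

    using (var queue = new BlockingCollection<int>(capacity))
    {
        // Consumers: GetConsumingEnumerable blocks while the queue is empty and
        // ends once CompleteAdding has been called and the queue is drained.
        var consumers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Run(() =>
            {
                foreach (var reference in queue.GetConsumingEnumerable())
                {
                    Interlocked.Add(ref total, Process(reference));
                }
            }))
            .ToArray();

        // Producer: Add blocks while the queue is at its bounded capacity.
        foreach (var reference in references)
        {
            queue.Add(reference);
        }
        queue.CompleteAdding(); // tell the consumers no more items are coming

        Task.WaitAll(consumers);
    }

    return total;
}

// Usage: 100 references, 4 workers, at most 16 items queued at a time.
Console.WriteLine(RunPipeline(Enumerable.Range(1, 100), workerCount: 4, capacity: 16)); // prints 10100
```

The bounded capacity covers the "if the queue is limited" case from the sketch above without manual checks, and `CompleteAdding` gives the consumers a way to exit instead of looping forever.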
    

    You can find more details about producer/consumer in part 4 of Threading in C#: http://www.albahari.com/threading/part4.aspx
