Multithreading task to process files in C#

你的背包 2021-01-18 12:07

I've been reading a lot about threading but can't figure out how to find a solution to my issue. First let me introduce the problem. I have files which need to be processe

4 answers
  •  暗喜 (original poster)
    2021-01-18 12:40

    I was playing around with your problem and came up with the following approach. It might not be the best, but I believe it suits your needs.

    Before we begin, I'm a big fan of extension methods, so here is one:

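    using System;
    using System.Collections.Generic;

    public static class IEnumerableExtensions
    {
        // Calls the action for every element, passing the element and its zero-based index.
        public static void Each<T>(this IEnumerable<T> ie, Action<T, int> action)
        {
            var i = 0;
            foreach (var e in ie) action(e, i++);
        }
    }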
    

    This loops over a collection (like foreach) but hands you both the item and its index. You'll see why this is needed later.
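
    To make the index handling concrete, here is a minimal, self-contained sketch of how such a helper might be called. The generic signature of `Each` is an assumption about what the extension method was meant to look like, and the `EachDemo` class, its `Describe` method and the sample array are made up purely for illustration. The built-in LINQ `Select` overload that also passes an index gives the same item/index pairing without a custom helper.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class IEnumerableExtensions
    {
        // Assumed generic form of the helper: passes each item together with its position.
        public static void Each<T>(this IEnumerable<T> ie, Action<T, int> action)
        {
            var i = 0;
            foreach (var e in ie) action(e, i++);
        }
    }

    public static class EachDemo
    {
        // Hypothetical helper: formats every item as "index:item".
        public static List<string> Describe(IEnumerable<string> items)
        {
            var lines = new List<string>();
            items.Each((item, index) => lines.Add(index + ":" + item));
            return lines;
        }

        public static void Main()
        {
            var hosts = new[] { "host1", "host2", "host3" }; // sample data for illustration

            foreach (var line in Describe(hosts)) Console.WriteLine(line);

            // Same pairing using the built-in indexed Select overload.
            foreach (var line in hosts.Select((host, index) => index + ":" + host))
                Console.WriteLine(line);
        }
    }
    ```

    Both loops print the same lines, so whether to keep the custom `Each` is mostly a matter of taste.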

    Then we have the variables.

    // Note: despite its name, group_file_paths holds the host of each file ...
    public static string[] group_file_paths =
    {
        "host1", "host1", "host1", "host2", "host2", "host3", "host4", "host4",
        "host5", "host5", "host6"
    };

    // ... and group_file_host_name holds the file paths; entry i belongs to host i above.
    public static string[] group_file_host_name =
    {
        @"c:\host1_file1", @"c:\host1_file2", @"c:\host1_file3", @"c:\host2_file1", @"c:\host2_file2", @"c:\host3_file1",
        @"c:\host4_file1", @"c:\host4_file2", @"c:\host5_file1", @"c:\host5_file2", @"c:\host6_file1"
    };
    

    Then the main code:

    public static void Main(string[] args)
    {
        var filesToProcess = new Dictionary<string, List<string>>();

        // Loop over the 2 arrays and create a dictionary that contains the host as the key, and then all the filenames.
        group_file_paths.Each((host, hostIndex) =>
        {
            if (!filesToProcess.ContainsKey(host))
            {
                filesToProcess.Add(host, new List<string>());
            }
            filesToProcess[host].Add(group_file_host_name[hostIndex]);
        });

        var tasks = new List<Task>();

        // One task per host: the files of a host run one after another, hosts run in parallel.
        foreach (var kvp in filesToProcess)
        {
            tasks.Add(Task.Factory.StartNew(() =>
            {
                foreach (var file in kvp.Value)
                {
                    process_file(kvp.Key, file);
                }
            }));
        }

        // Block until every host's task has finished.
        Task.WaitAll(tasks.ToArray());
    }
    

    Some explanation might be needed here:

    I'm creating a dictionary that contains your hosts as keys and, for each host, the list of files that need to be processed as the value.

    For the first three hosts, your dictionary will look like this:

    • host1
      • file 1
      • file 2
      • file 3
    • host2
      • file 1
      • file 2
    • host3
      • file 1
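
    The same grouping can also be sketched with LINQ instead of a hand-written loop. This is an alternative technique, not what the code above does: it pairs the two arrays with `Zip` and groups the pairs with `GroupBy`. The `GroupingDemo` class, the `GroupByHost` method and the shortened sample arrays are hypothetical names and data chosen for this example, and the length check is an added assumption that both arrays are meant to line up one-to-one.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class GroupingDemo
    {
        // Pairs hosts[i] with files[i] and groups the file paths by host.
        public static Dictionary<string, List<string>> GroupByHost(string[] hosts, string[] files)
        {
            if (hosts.Length != files.Length)
                throw new ArgumentException("Both arrays must have the same length.");

            return hosts
                .Zip(files, (host, file) => new { host, file })
                .GroupBy(pair => pair.host, pair => pair.file)
                .ToDictionary(g => g.Key, g => g.ToList());
        }

        public static void Main()
        {
            // Shortened sample data for illustration.
            var hosts = new[] { "host1", "host1", "host2" };
            var files = new[] { @"c:\host1_file1", @"c:\host1_file2", @"c:\host2_file1" };

            foreach (var kvp in GroupByHost(hosts, files))
                Console.WriteLine("{0}: {1}", kvp.Key, string.Join(", ", kvp.Value));
        }
    }
    ```

    Failing fast on mismatched lengths is a design choice: with two parallel arrays, a missing entry otherwise silently assigns files to the wrong host.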

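    After that I create a collection of tasks, one per host, that are executed using the TPL (Task Parallel Library). All tasks are started immediately, and I then wait for all of them to finish. Files that belong to the same host are processed one after another inside that host's task, while different hosts are processed in parallel.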

    Your processing method might look as follows, just for testing purposes:

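    // One shared generator: a new Random per call can produce identical seeds
    // when several tasks start at the same moment (notably on .NET Framework).
    private static readonly Random time_delay_random = new Random();

    public static void process_file(string host, string file)
    {
        int delay;
        lock (time_delay_random) { delay = time_delay_random.Next(3000) + 1000; } // Random is not thread-safe

        Console.WriteLine("Host '{0}' - Started processing the file {1}.", host, file);
        Thread.Sleep(delay); // simulate 1 to 4 seconds of work
        Console.WriteLine("Host '{0}' - Completed processing the file {1}.", host, file);
        Console.WriteLine("");
    }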
    

    This post does not include a way to limit the number of threads yourself, but that can easily be achieved with a completion handler on the tasks. Then, when any task completes, you can loop over your collection again and start a new task for work that hasn't been started yet.
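
    One way this idea might be sketched is a loop around `Task.WhenAny`: start tasks until a limit is reached, wait for any of them to finish, then start the next one. This is only an illustration under assumptions. `ThrottledDemo`, `RunThrottled` and the `maxParallel` parameter are hypothetical names, each `Action` stands in for "process all files of one host", and the peak counter exists only to make the limit observable.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    public static class ThrottledDemo
    {
        // Runs all work items, but never more than maxParallel at the same time.
        // Returns the highest number of work items observed running simultaneously.
        public static int RunThrottled(IEnumerable<Action> workItems, int maxParallel)
        {
            if (maxParallel < 1) throw new ArgumentOutOfRangeException("maxParallel");

            var pending = new Queue<Action>(workItems);
            var running = new List<Task>();
            var gate = new object();
            int current = 0, peak = 0;

            while (pending.Count > 0 || running.Count > 0)
            {
                // Start new tasks until the limit is reached.
                while (pending.Count > 0 && running.Count < maxParallel)
                {
                    var work = pending.Dequeue();
                    running.Add(Task.Run(() =>
                    {
                        lock (gate) { current++; peak = Math.Max(peak, current); }
                        try { work(); }
                        finally { lock (gate) { current--; } }
                    }));
                }

                // Block until any running task finishes, then free its slot.
                var finished = Task.WhenAny(running).Result;
                running.Remove(finished);
                finished.Wait(); // rethrows if the work item failed
            }

            return peak;
        }

        public static void Main()
        {
            // Hypothetical per-host work; the sleep simulates file processing.
            var work = Enumerable.Range(1, 6).Select(n => (Action)(() =>
            {
                Console.WriteLine("host{0} started", n);
                Thread.Sleep(200);
                Console.WriteLine("host{0} done", n);
            }));

            var peak = RunThrottled(work, 2);
            Console.WriteLine("Peak parallel tasks: {0}", peak);
        }
    }
    ```

    In the setting above, each work item would wrap the inner `foreach` over one host's files, so files of the same host still run sequentially while at most `maxParallel` hosts are handled at once.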

    So, I hope it helps.
