What is the purpose of TWorkStealingQueue and how to use it?

Submitted by 给你一囗甜甜゛ on 2021-02-20 06:13:35

Question


I am working on a program to migrate a large number of files (approx. 1 million) from potentially big directory structures.
My migration code already works quite well: I use a class to iterate through the directory structure, identify the files, and migrate them sequentially, one after another.

Now I want to make better use of the available CPU resources of the target machine and run those migrations asynchronously, grabbing threads from a System.Threading.TThreadPool to execute them.

I am familiar with the ITask interface and with how to use TTask to set up an array of tasks that is managed in conjunction with a TThreadPool instance.
However, building up a big TArray<ITask> and waiting for completion only after all directories have been walked seems an inappropriate and inefficient approach (especially with regard to memory consumption).

What I believe I need is just a simple thread-safe producer/consumer queue that grows and shrinks as worker threads become available to consume the tasks and complete them.

I found something in the Embarcadero docs that sounds promising in this regard, called TWorkStealingQueue, but as is so often the case, the documentation is pretty poor and lacks concise examples of how to use it.

My current approach boils down to something like this:

type
TMigrationFileWalker = class(TFileWalker)
strict private
    var
       FPendingMigrationTasks : TArray<ITask>;

    function createMigrationTask(const filename : string) : ITask;
strict protected
    procedure onHandleFile(const filename : string); override;
public
    procedure walkDirectoryTree(const startDir : string); override;
end;

implementation

procedure TMigrationFileWalker.onHandleFile(const filename : string);
var
    migrationTask : ITask;
begin
    migrationTask := createMigrationTask(filename);
    self.FPendingMigrationTasks := self.FPendingMigrationTasks + [migrationTask];
    migrationTask.Start();
end;

procedure TMigrationFileWalker.walkDirectoryTree(const startDir : string);
begin
    inherited walkDirectoryTree(startDir);
    TTask.WaitForAll(self.FPendingMigrationTasks,SOME_REASONABLE_TIMEOUT);
end;

Of course I could write a thread-safe producer/consumer queue myself and manage a bunch of threads working on it. But the promise of TWorkStealingQueue seems to be that it works with a thread pool, and I'd like to take advantage of the load-balancing mechanisms that already come with it.
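For reference, here is a minimal sketch of that hand-rolled alternative. It assumes TThreadedQueue<string> from System.Generics.Collections as the bounded thread-safe queue and a fixed number of worker tasks. MigrateFile, ConsumeFiles, WORKER_COUNT, QUEUE_DEPTH and the generated file names are hypothetical placeholders and not part of my actual code:

```delphi
program QueueSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.SyncObjs,
  System.Threading,
  System.Generics.Collections;

const
  WORKER_COUNT = 4;    // assumption: number of consumer tasks
  QUEUE_DEPTH = 1024;  // assumption: max. pending file names held in memory

var
  Queue: TThreadedQueue<string>;
  Workers: array[0..WORKER_COUNT - 1] of ITask;
  Migrated: Integer;
  I: Integer;

// Hypothetical stand-in for the real per-file migration.
procedure MigrateFile(const FileName: string);
begin
  TInterlocked.Increment(Migrated);
end;

// Consumer loop: pops file names until it receives the empty-string sentinel.
procedure ConsumeFiles;
var
  FileName: string;
begin
  repeat
    FileName := Queue.PopItem; // blocks while the queue is empty
    if FileName <> '' then
      MigrateFile(FileName);
  until FileName = '';
end;

begin
  Migrated := 0;
  Queue := TThreadedQueue<string>.Create(QUEUE_DEPTH);
  try
    // Start the consumers on the default thread pool.
    for I := Low(Workers) to High(Workers) do
      Workers[I] := TTask.Run(
        procedure
        begin
          ConsumeFiles;
        end);

    // Producer: in the real program, onHandleFile would push each file name.
    // PushItem blocks while the queue is full, which bounds memory use.
    for I := 1 to 100 do
      Queue.PushItem(Format('file%d.dat', [I]));

    // One sentinel per worker tells it to stop.
    for I := 1 to WORKER_COUNT do
      Queue.PushItem('');

    TTask.WaitForAll(Workers);
    Writeln('Migrated files: ', Migrated);
  finally
    Queue.Free;
  end;
end.
```

This keeps only a bounded number of file names in memory instead of one ITask per file, but the number of workers is fixed by hand, which is exactly the part I would rather leave to the thread pool.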

Has anyone here already used TWorkStealingQueue and can give a short, concise example of how it could be used in the scenario described above? Or at least clarify the actual purpose of that class, in case I have totally misunderstood it from the name?

Searching for TWorkStealingQueue didn't turn up anything better than links back to the insufficient Embarcadero documentation.

Source: https://stackoverflow.com/questions/51571759/what-is-the-purpose-of-tworkstealingqueue-and-how-to-use-it
