Perl run the same script for different directories at the same time

南笙酒味 提交于 2019-12-11 17:22:27

问题


I have a directory that contains other directories (the number of directories is arbitrary), like this:

Main_directory_samples/

  • subdirectory_sample_1/
  • subdirectory_sample_2/
  • subdirectory_sample_3/
  • subdirectory_sample_4/

I have a script that receives as input one directory each time and it takes 1h to run (for each directory). To run the script I have the following code:

opendir DIR, $maindirectory or die "Can't open directory!!";
while(my $dir = readdir DIR){
    if($dir ne '.' && $dir ne '..'){ 
        system("/bin/bash", "my_script.sh", $maindirectory.'/'.$dir);    
    }   
}
closedir DIR;

However, I want to run the script for different directories at the same time. For instance, the subdirectory_sample_1/ and subdirectory_sample_2/ would run in the same thread; subdirectory_sample_3/ and subdirectory_sample_4/ in another. But I just can't find a way to do this.


回答1:


As you're just starting external processes and waiting for them, a non-threading option:

use strict;
use warnings;
use Path::Tiny;
use IO::Async::Loop;
use Future::Utils 'fmap_concat';

my $loop = IO::Async::Loop->new;

my $maindirectory = '/foo/bar';
my @subdirs = grep { -d } path($maindirectory)->children; # excludes . and ..

# runs this code to maintain up to 'concurrent' pending futures at once
my $main_future = fmap_concat {
  my $dir = shift;
  my $future = $loop->new_future;
  my $process = $loop->open_process(
    command => ['/bin/bash', 'my_script.sh', $dir],
    on_finish => sub { $future->done(@_) },
    on_exception => sub { $future->fail(@_) },
  );
  return $future;
} foreach => \@subdirs, concurrent => 2;

# run event loop until all futures are done or one fails, throw exception on failure
my @exit_codes = $main_future->get;

See the docs for IO::Async::Loop and Future::Utils.




回答2:


One way is to fork and in each child process a group of directories.

A basic example

use warnings;
use strict;
use feature 'say';

use List::MoreUtils qw(natatime);
use POSIX qw(:sys_wait_h);  # for WNOHANG
use Time::HiRes qw(sleep);  # for fractional seconds

my @all_dirs = qw(d1 d2 d3 d4);
my $path = 'maindir';
my @procs;

# Get iterator over groups (of 2)
my $it = natatime 2, @all_dirs;

while (my @dirs = $it->()) { 

    my $pid = fork // do {      #/
        warn "Can't fork for @dirs: $!";
        next;
    };

    if ($pid == 0) { 
        foreach my $dir (@dirs) {
            my @cmd = ('/bin/bash/', 'my_script.sh', "$path/$dir");
            say "in $$, \@cmd: (@cmd)"; 
            # system(@cmd) == 0 or do { inspect $? }
        };  
        exit;
    };  
    push @procs, $pid;
}

# Poll with non-blocking wait for processes (reap them)
my $gone;
while (($gone = waitpid -1, WNOHANG) > -1) {
    my $status = $?; 
    say "Process $gone exited with $status" if $gone > 0;
    sleep 0.1;
}

See system and/or exec for details, in particular on error checking, as well as $? variable. It can be unpacked to retrieve more details about the error; or, at least print a warning and skip to the next item (which happens above anyway).

The code above prints out the command and pid's with their exit status, but replace @cmd with a test command of no consequence and un-comment the system line to try this out.

Watch for how many jobs there are. A basic rule of thumb is to not have more than 2 per core at which point the performance starts suffering, but this depends on many details. Experiment to find the sweet spot for your case. I like to have a job per core and then at least one core free. In order to throttle this see modules linked at the end.

To break all jobs (directories) into groups I used natatime from List::MoreUtils (n-at-a-time). If there are more specific criteria about how to group directories adjust that.

See Forks::Super and Parallel::ForkManager for higher-level ways to work with forked processes.



来源:https://stackoverflow.com/questions/52612742/perl-run-the-same-script-for-different-directories-at-the-same-time

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!