finding many thousands of files in a directory pattern in Perl

Submitted by 做~自己de王妃 on 2019-12-11 04:39:34

Question


I would like to find files matching a name pattern under directories matching a directory pattern in Perl. The search will return many thousands of entries, like this:

find ~/mydir/*/??/???/???? -name "*.$refinfilebase.search" -print

I've been told there are different ways to handle it, e.g.:

File::Find
glob()
opendir, readdir, grep
Diamond operator, e.g.: my @files = <$refinfilebase.search>

Which one would be most appropriate if the script has to run on older versions of Perl or on minimal Perl installations?


Answer 1:


There is also DirHandle:

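use DirHandle;
my $d = DirHandle->new(".");    # open the current directory
if (defined $d) {
    # first pass over the entries
    while (defined($_ = $d->read)) { something($_); }
    $d->rewind;                 # go back to the first entry
    # second pass over the same handle
    while (defined($_ = $d->read)) { something_else($_); }
    undef $d;                   # dropping the object closes the handle
}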

For use cases of readdir and glob, see "What reasons are there to prefer glob over readdir (or vice-versa) in Perl?"

I prefer glob for quickly grabbing a list of files in a directory (no subdirectories) and processing them, like this:

map { process_bam($_) } glob("bam_files/*.bam");

This is more convenient because it does not return . and .. even if you ask for everything (*), and it returns the path including the directory if you use a directory in the glob pattern.
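To make that difference concrete, here is a small sketch that lists the same directory with glob and with readdir. The temporary directory and the file names are made up for the demonstration, and it assumes the temporary path contains no whitespace (glob splits its pattern on whitespace).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory with two visible files and one dot-file
# (names are hypothetical, chosen only for this demonstration).
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.bam', 'b.bam', '.hidden') {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    close($fh);
}

# glob: '*' skips names starting with '.', and each result
# keeps the directory prefix used in the pattern.
my @globbed = glob("$dir/*");

# readdir: returns bare names, including dot-files and,
# on most platforms, '.' and '..'.
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
my @listed = readdir($dh);
closedir($dh);

print "glob:    $_\n" for @globbed;
print "readdir: $_\n" for sort @listed;
```

With readdir you have to prepend the directory yourself before opening or testing an entry, which is easy to forget.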

You can also use glob in a quick one-liner piped to xargs, or in a bash for loop, when you need to preprocess the file names in the list:

perl -lE 'print join("\n", map {s/srf\/(.+)\.srf$/$1/;$_} glob("srf/198*.srf"))' | xargs -n 1.....

readdir has advantages in other scenarios, so use whichever fits your task better.




Answer 2:


For very large directories, opendir() is probably the safest choice: with readdir you can process entries one at a time, without reading the whole listing into memory or filtering it first. It can also be faster because the entries come back in no particular order; sorting a very large directory can be a performance hit on some operating systems. opendir is also a built-in, so it is available wherever Perl is.

Note that its exact behaviour may differ between platforms, so you need to be careful when coding with it. This mainly affects what it returns for entries such as the parent and current directory, which you may need to treat specially.
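A minimal sketch of that approach follows. The helper name each_matching_entry and its callback interface are hypothetical, and the lexical directory handle assumes Perl 5.6 or later (older Perls need a bareword handle such as DIR).

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback for every entry in $dir whose
# name matches $regex, reading one entry at a time.
sub each_matching_entry {
    my ($dir, $regex, $callback) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    # The explicit defined() keeps an entry named "0" from ending the loop.
    while (defined(my $name = readdir($dh))) {
        # Skip the current and parent directory if the platform returns them.
        next if $name eq '.' or $name eq '..';
        next unless $name =~ $regex;
        $callback->("$dir/$name");    # readdir gives bare names; add the directory
    }
    closedir($dh);
    return;
}

# Usage: count entries ending in ".search" in the current directory.
my $count = 0;
each_matching_entry('.', qr/\.search\z/, sub { $count++ });
print "Found $count matching entries\n";
```

Because nothing is accumulated unless the callback does so, memory use stays flat however many entries the directory holds.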

glob() is more useful when you only want some files, matching by a pattern. File::Find is more useful when recursing through a set of nested directories. If you don't need either, opendir() is a good base.
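For the search in the question, one option is to combine the two: let glob expand the directory pattern, then let File::Find (a core module) walk each matching directory. This is only a sketch under assumptions: find_by_suffix is a hypothetical helper, the value given to $refinfilebase is invented, and collecting every path in an array may not suit truly huge result sets.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: expand a directory glob pattern, then walk each
# matching directory and collect files whose names end in $suffix.
sub find_by_suffix {
    my ($dir_pattern, $suffix) = @_;
    my @dirs = grep { -d $_ } glob($dir_pattern);
    return () unless @dirs;           # nothing to walk
    my @found;
    find(
        sub {
            # Inside the callback, $_ is the bare file name and
            # $File::Find::name is the path including the directory.
            push @found, $File::Find::name
                if -f $_ && /\Q$suffix\E\z/;
        },
        @dirs
    );
    return @found;
}

my $refinfilebase = 'example';        # assumed value; the question does not give one
my $home = defined $ENV{HOME} ? $ENV{HOME} : '.';
my @files = find_by_suffix("$home/mydir/*/??/???/????", ".$refinfilebase.search");
print scalar(@files), " files found\n";
```

Like find with -name, this also descends into any subdirectories below the matched directories. If the files only ever sit directly inside them, a plain opendir/readdir loop per directory avoids the recursion.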



Source: https://stackoverflow.com/questions/6614085/finding-many-thousands-of-files-in-a-directory-pattern-in-perl
