Line Input operator with glob returning old values

不想你离开。 提交于 2021-01-27 19:15:52

问题


The following excerpt code, when running on perl 5.16.3 and older versions, has a strange behavior, where subsequent calls to a glob in the line input operator causes the glob to continue returning previous values, rather than running the glob anew.

#!/usr/bin/env perl

use strict;
use warnings;

my @dirs = ("/tmp/foo", "/tmp/bar");

foreach my $dir (@dirs) {    
    my $count = 0;
    my $glob = "*";
    print "Processing $glob in $dir\n";
    while (<$dir/$glob>) {
        print "Processing file $_\n";
        $count++;
        last if $count > 0;
    }
}

If you put two files in /tmp/foo and one or more in /tmp/bar, and run the code, I get the following output:

Processing * in /tmp/foo

Processing file /tmp/foo/foo.1

Processing * in /tmp/bar

Processing file /tmp/foo/foo.2

I thought that when the while terminates after the last, that the new invocation of the while on the second iteration would re-run the glob and give me the files listed /tmp/bar, but instead I get a continuation of what's in /tmp/foo.

It's almost like the angle operator glob is acting like a precompiled pattern. My hypothesis is that the angle operator is creating a filehandle in the symbol table that's still open and being reused behind the scenes, and that it's scoped to the containing foreach, or possibly the whole subroutine.


回答1:


From I/O Operators in perlop (my emphasis)

A (file)glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In list context, this isn't important because you automatically get them all anyway. However, in scalar context the operator returns the next value each time it's called, or undef when the list has run out.

Since <> is called in scalar context here and you exit the loop with last after the first iteration, the next time you enter it it keeps reading from the original list.


It is clarified in comments that there is a practical need behind this quest: process only some of the files from a directory and never return all filenames since there can be many.

So assigning from glob to a list and working with it, or better yet using for instead of while as commented by ysth, doesn't help here as it returns a huge list.

I haven't found a way to make glob (what <> with a filename pattern uses) drop and rebuild the list once it's generated it, without getting to its end first. Apparently, each instance of the operator gets its own list. So using another <> inside the while loop with the hope of resetting it, in any way and even with the same pattern, doesn't affect the list being iterated over in while (<$glob>).

Just to note, breaking out of the loop with a die (with while in an eval) doesn't help either; the next time we come to that while the same list is continued. Wrapping it in a closure

sub iter_glob { my $dir = shift; return sub { scalar <"$dir/*"> } }

for my $d (@dirs) {
    my $iter = iter_glob($d);
    while (my $f = $iter->()) {
        # ...
    }
}

met with the same fate; the original list keeps being used.

The solution then is to use readdir instead.



来源:https://stackoverflow.com/questions/44856014/line-input-operator-with-glob-returning-old-values

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!