perl-mechanize runs into limitations - several debugging attempts started

孤街醉人 submitted on 2019-12-08 15:28:56

Question


Hello dear developers.

First of all, sorry for being a newbie: I am pretty new to Perl.

I am trying to learn Perl by playing around with code and snippets. Today I have a little script that runs a Mechanize job but does not run to the end. The aim: I want to get thumbnails of website screenshots.

I run this script, which is written to take screenshots of websites; I also have mozrepl up and running. What is strange is the output (see below). Question: should I change the script, and why do I get this output?

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = new WWW::Mechanize::Firefox();

open(INPUT, "<urls.txt") or die $!;

while (<INPUT>) {
        chomp;
        print "$_\n";
        $mech->get($_);
        my $png = $mech->content_as_png();
        my $name = "$_";
        $name =~s/^www\.//;
        $name .= ".png";
        open(OUTPUT, ">$name");
        print OUTPUT $png;
        sleep (5);
}

What the code gives back is the following:

http://www.unifr.ch/sfm
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 2.
http://www.zug.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 3.
http://www.schwyz.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 4.
http://www.luzern.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 5.
http://www.schwyz.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 6.
http://www.phvs.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 7.
http://www.phtg.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 8.
http://www.phsg.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 9.
http://www.phsh.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 10.
http://www.phr.ch

What I have done so far to get rid of the issue: I can use the diagnostics pragma to get more insight into what is happening. Alternatively, searching for "print() on closed filehandle OUTPUT" gives lots of answers, which tell us that we did not use autodie and also did not check the return value of open.

Hmm, I just mused on the filehandle.

The open call failed, and since the script assumed it was successful and went on to use the filehandle (which was never opened), it produced that error.

The lesson to learn here is that we should ALWAYS check the return value of an open call to verify that it was successful, and take proper action if it wasn't.
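For reference, here is a minimal sketch of what the autodie pragma does when an open fails. The path is hypothetical and is assumed to point into a directory that does not exist, so the open is expected to fail; the eval is only there so the demo can report the error instead of terminating.

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now die on failure instead of returning false

# Hypothetical path: the directory is assumed not to exist, so open should fail.
my $path = 'no-such-dir-for-demo/shot.png';

my $ok = eval {
    open(my $fh, '>', $path);    # no "or die" needed under autodie
    close($fh);
    1;
};

# $@ holds an autodie::exception object that stringifies to a readable message.
print "open failed: $@" unless $ok;
```

Without the eval, the script would simply stop at the failing open with a message naming the file and the reason.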

I guess I have to learn some Perl basics here and correct the code accordingly.

We should also use the three-argument form of open and a lexical variable for the filehandle.

What about this one? Code:

open my $out_fh, '>', $name or die "failed to create/open '$name' <$!>";

I could just build this part into the original code. What do you think?

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = new WWW::Mechanize::Firefox();

open my $out_fh, '>', $name or die "failed to create/open '$name' <$!>";


open(INPUT, "<urls.txt") or die $!;

while (<INPUT>) {
        chomp;
        print "$_\n";
        $mech->get($_);
        my $png = $mech->content_as_png();
        my $name = "$_";
        $name =~s/^www\.//;
        $name .= ".png";
        open(OUTPUT, ">$name");
        print OUTPUT $png;
        sleep (5);
}

Well, what do you think?

How would you change the code to make sure that the script runs successfully?


Answer 1:


For one thing, your input contains slashes and then you are trying to use that input to create a filename. Since your input begins with "http://www" and not "www", your substitution operation doesn't do anything, either.

my $name = "$_";            # e.g. $name <= "http://www.zug.phz.ch"
$name =~s/^www\.//;         # $name still is "http://www.zug.phz.ch"
$name .= ".png";            # $name is "http://www.zug.phz.ch.png"
open(OUTPUT, ">$name");     # error: no directory named "./http:"
print OUTPUT $png;
sleep (5);

You'll want to do a better job of sanitizing your filename. Maybe something like

$name =~ s![:/]+!-!g; #http://foo.com/bar.html  becomes  http-foo.com-bar.html
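A slightly fuller variation on that idea, as a sketch only: the helper name url_to_filename is hypothetical, and stripping the scheme and the leading "www." is an assumption about what the filenames should look like, not something the question specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a URL into a name that is safe to use
# as a file in the current directory.
sub url_to_filename {
    my ($url) = @_;
    my $name = $url;
    $name =~ s!^https?://!!;             # drop the scheme first ...
    $name =~ s!^www\.!!;                 # ... so "www." really is at the start now
    $name =~ s![^A-Za-z0-9._-]+!-!g;     # replace runs of any other character with "-"
    $name =~ s!-+$!!;                    # trim a trailing "-" left by a final slash
    return "$name.png";
}

print url_to_filename($_), "\n"
    for qw(http://www.zug.phz.ch http://www.unifr.ch/sfm);
# prints:
#   zug.phz.ch.png
#   unifr.ch-sfm.png
```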

If anything, the return value you want to check is that of the open call inside your while loop. If you had said

open(OUTPUT,">$name") or warn "Failed to open '$name': $!";

you probably would have figured this out on your own.
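Putting the pieces together, the write step could be sketched roughly as follows. The helper name save_png is hypothetical, and the binmode call is an addition that the question does not contain: PNG data is binary, so the output handle should not apply any newline translation.

```perl
use strict;
use warnings;

# Hypothetical helper: write raw image bytes to $name and report failures
# instead of silently printing to an unopened filehandle.
sub save_png {
    my ($name, $bytes) = @_;

    # Three-argument open with a lexical filehandle, return value checked.
    open(my $out_fh, '>', $name)
        or do { warn "Failed to open '$name': $!\n"; return 0 };

    binmode $out_fh;            # PNG data is binary
    print {$out_fh} $bytes;

    close($out_fh)
        or do { warn "Failed to close '$name': $!\n"; return 0 };
    return 1;
}

# In the loop from the question, the call would look something like:
#   save_png($name, $mech->content_as_png()) or next;
```

Using warn plus a false return value, rather than die, lets the loop skip one bad URL and carry on with the rest of urls.txt.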



Source: https://stackoverflow.com/questions/9656892/perl-mechanize-runs-into-limitations-several-debugging-attempts-started
