So I am using WWW::Mechanize to crawl sites. It works great, except if I request a URL such as:
http://www.levi.com/
I am redirected...
use strict;
use warnings;
use URI;
use WWW::Mechanize;
my $url = 'http://...';
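# Disable autocheck so that a non-2xx response does not die.
my $mech = WWW::Mechanize->new(autocheck => 0);
# Do not follow redirects automatically, so the 3xx response itself is returned.
$mech->max_redirect(0);
$mech->get($url);
my $status = $mech->status();
if (($status >= 300) && ($status < 400)) {
    my $location = $mech->response()->header('Location');
    if (defined $location) {
        print "Redirected to $location\n";
        # Location may be relative, so resolve it against the current base URI.
        $mech->get(URI->new_abs($location, $mech->base()));
    }
}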
If the status code is 3xx, check the Location header of the response for the redirect URL.
You can also get the same information by calling the redirects() method on the response object.
use strict;
use warnings;
use feature qw( say );
use WWW::Mechanize;
my $ua = WWW::Mechanize->new;
my $res = $ua->get('http://metacpan.org');
my @redirects = $res->redirects;
say 'request uri: ' . $redirects[-1]->request->uri;
say 'location header: ' . $redirects[-1]->header('Location');
Prints:
request uri: http://metacpan.org
location header: https://metacpan.org/
See https://metacpan.org/pod/HTTP::Response#$r-%3Eredirects for details.
Keep in mind that more than one redirect may have taken you to your current location, so you may want to inspect every response returned by redirects().
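To illustrate inspecting every hop rather than only the last one, here is a minimal sketch. It builds a fake two-hop chain offline by linking HTTP::Response objects through previous(), so it runs without network access. The example.com URLs and the helper names fake_response and redirect_chain are made up for illustration and are not part of any library.

```perl
use strict;
use warnings;
use feature qw( say );
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: build a response with a request URI, an optional
# Location header, and an optional link to the response that preceded it.
sub fake_response {
    my ($code, $uri, $location, $previous) = @_;
    my $res = HTTP::Response->new($code);
    $res->request(HTTP::Request->new(GET => $uri));
    $res->header(Location => $location) if defined $location;
    $res->previous($previous)           if defined $previous;
    return $res;
}

# A made-up chain: http -> https -> www, ending in a 200 response.
my $hop1  = fake_response(301, 'http://example.com/',  'https://example.com/');
my $hop2  = fake_response(302, 'https://example.com/', 'https://www.example.com/', $hop1);
my $final = fake_response(200, 'https://www.example.com/', undef, $hop2);

# Hypothetical helper: one "from -> to" string per redirect, oldest first,
# which is the order redirects() returns them in.
sub redirect_chain {
    my ($res) = @_;
    return map { $_->request->uri . ' -> ' . $_->header('Location') }
        $res->redirects;
}

say for redirect_chain($final);
```

Since get() on a WWW::Mechanize object returns an HTTP::Response, the same redirect_chain helper should work on a live response as well, provided redirects are being followed (that is, max_redirect has not been set to 0).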