Perl WWW::Mechanize (or LWP) get redirect url

北恋 2020-12-17 23:08

I am using WWW::Mechanize to crawl sites. It works well, except when I request a URL such as:

http://www.levi.com/

I am red

2 Answers
  • 2020-12-17 23:51
    use strict;
    use warnings;
    use URI;
    use WWW::Mechanize;
    
    my $url = 'http://...';
    my $mech = WWW::Mechanize->new(autocheck => 0);
    $mech->max_redirect(0);
    $mech->get($url);
    
    my $status = $mech->status();
    if (($status >= 300) && ($status < 400)) {
      my $location = $mech->response()->header('Location');
      if (defined $location) {
        print "Redirected to $location\n";
        $mech->get(URI->new_abs($location, $mech->base()));
      }
    }
    

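    If the status code is 3xx, the response is a redirect, and the target URL is in its Location header. Setting max_redirect(0) stops WWW::Mechanize from following the redirect automatically, so the 3xx response itself is what you get back.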

  • 2020-12-18 00:07

    You can also get the same information by calling the redirects() method on the response object.

    use strict;
    use warnings;
    use feature qw( say );
    
    use WWW::Mechanize;
    
    my $ua = WWW::Mechanize->new;
    my $res = $ua->get('http://metacpan.org');
    
    my @redirects = $res->redirects;
    say 'request uri: ' . $redirects[-1]->request->uri;
    say 'location header: ' . $redirects[-1]->header('Location');
    

    Prints:

    request uri: http://metacpan.org
    location header: https://metacpan.org/
    

    See https://metacpan.org/pod/HTTP::Response#$r-%3Eredirects. Keep in mind that more than one redirect may have taken you to your current location, so you may want to inspect every response returned by redirects().
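
    To inspect every hop rather than only the last one, something like the following sketch might help. The helper name redirect_chain and the example.com URLs are made up for illustration, and the two-hop chain is built by hand with HTTP::Response objects so the snippet runs without network access. With a real crawl you would pass in the response returned by $mech->get(...) instead.

```perl
use strict;
use warnings;
use feature qw( say );

use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: return one [request URI, status code, Location header]
# triple per redirect hop, oldest hop first.
sub redirect_chain {
    my ($res) = @_;
    return map {
        [ $_->request->uri->as_string, $_->code, $_->header('Location') ]
    } $res->redirects;
}

# Hand-built two-hop chain (assumed URLs) standing in for a live response.
my $first = HTTP::Response->new( 301, 'Moved Permanently',
    [ Location => 'https://example.com/' ] );
$first->request( HTTP::Request->new( GET => 'http://example.com/' ) );

my $second = HTTP::Response->new( 302, 'Found',
    [ Location => 'https://example.com/en/' ] );
$second->request( HTTP::Request->new( GET => 'https://example.com/' ) );
$second->previous($first);

my $final = HTTP::Response->new( 200, 'OK' );
$final->request( HTTP::Request->new( GET => 'https://example.com/en/' ) );
$final->previous($second);

# Print each hop as "status from -> to", then the URI that was finally served.
my @chain = redirect_chain($final);
say "$_->[1] $_->[0] -> $_->[2]" for @chain;
say 'final uri: ' . $final->request->uri;
```

    If the list comes back empty, the request was not redirected at all, so it is worth checking for that before indexing into it with [-1]. With WWW::Mechanize, $mech->uri should also give you the URI you ended up at after all redirects.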
