Here\'s the deal. Is there a way to have strings tokenized in a line based on multiple regexes?
One example:
I have to get all href tags, their corresponding
If you're specifically after parsing links out of web-pages, then Perl's WWW::Mechanize module will figure things out for you in a very elegant fashion. Here's a sample program that grabs the first page of Stack Overflow and parses out all the links, printing their text and corresponding URLs:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new;
$mech->get("http://stackoverflow.com/");
$mech->success or die "Oh no! Couldn't fetch stackoverflow.com";
foreach my $link ($mech->links) {
print "* [",$link->text, "] points to ", $link->url, "\n";
}
In the main loop, each $link is a WWW::Mechanize::Link object, so you're not just constrained to getting the text and URL.
All the best,
Paul