Perl Regex - Get offset of all the matches instead of one

Posted by 主宰稳场 on 2019-12-12 11:06:00

Question


I want to search a file for a string and get the offsets of all the matches. The content of the file is as follows:

sometext
sometext
AAA
sometext
AAA
AAA
sometext

I am reading this whole file into a string $text and then doing a regex match for AAA as follows:

if($text =~ m/AAA/g) {
    $offset = $-[0];
}

This gives the offset of only one AAA. How can I get the offsets of all the matches?

I know that we can get all matches in an array using syntax like this:

my @matches = ($text =~ m/AAA/g);

But I want the offsets, not the matched strings.

Currently I am using the following code to get the offsets of all matches:

my $text= "sometextAAAsometextAAA";
my $regex = 'AAA';
my @matches = ();

while ($text =~ /($regex)/gi){
    my $match = $1;
    my $length = length($&);
    my $pos = length($`);
    my $start = $pos + 1;
    my $end = $pos + $length;
    my $hitpos = "$start-$end";
    push @matches, "$match found at $hitpos ";
}

print "$_\n" foreach @matches;

But is there a simpler way to do this?


Answer 1:


I don't think there's a built-in way to do this in Perl. But from "How can I find the location of a regex match in Perl?":

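sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    # In scalar context, /g resumes after the previous match on each iteration.
    while ($string =~ /$regex/g) {
        # $-[0] is the start offset of the match, $+[0] the offset just past its end.
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret;
}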
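
As a usage sketch, the helper can be called on the sample string from the question. The variable names @positions and $pair are made up for illustration, and the helper is repeated only so that the snippet runs on its own:

```perl
use strict;
use warnings;

# Same helper as above, repeated so that this sketch is self-contained.
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /$regex/g) {
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret;
}

my $text      = "sometextAAAsometextAAA";
my @positions = match_all_positions(qr/AAA/, $text);

for my $pair (@positions) {
    my ($start, $end) = @$pair;
    # $start is 0-based; $end points just past the match, so the length is $end - $start.
    print "match at $start-$end: ", substr($text, $start, $end - $start), "\n";
}
```

For this input the pairs should be 8-11 and 19-22. The offsets are 0-based, unlike the 1-based positions computed in the question.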



Answer 2:


You already know that you should use $-[0]! Replace

while ($text =~ /($regex)/gi){
    my $match = $1;
    my $length = length($&);
    my $pos = length($`);
    my $start = $pos + 1;
    my $end = $pos + $length;
    my $hitpos = "$start-$end";
    push @matches, "$match found at $hitpos ";
}

with

while ($text =~ /($regex)/gi){
    push @matches, "$1 found at $-[0]";
}

That said, I'm a big fan of separating calculations from output formatting, so I would do

while ($text =~ /($regex)/gi){
    push @matches, [ $1, $-[0] ];
}
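
A minimal sketch of how that separation might look end to end. Storing $+[0] as a third element and the sample string with a lowercase second hit are assumptions added here, not part of the answer; they make it possible to reproduce the question's 1-based start-end output at formatting time:

```perl
use strict;
use warnings;

my $text  = "sometextAAAsometextaaa";   # made-up sample; the second hit is lowercase
my $regex = 'AAA';
my @matches;

while ($text =~ /($regex)/gi) {
    # Store raw data only: matched text, 0-based start, offset just past the end.
    push @matches, [ $1, $-[0], $+[0] ];
}

my @lines;
for my $m (@matches) {
    my ($str, $start, $end) = @$m;
    # Formatting happens here; $start + 1 and $end give the question's 1-based range.
    push @lines, sprintf("%s found at %d-%d", $str, $start + 1, $end);
}
print "$_\n" for @lines;
```

Because @matches holds plain numbers, the same data can feed substr, sorting, or a different report format without re-running the regex.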

PS: Unless you've unrolled a while loop, if (/.../g) makes no sense. At best, the /g does nothing. At worst, you get incorrect results.
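
One way the incorrect result can arise, sketched with a made-up string: in scalar context, /g starts searching at pos($text), which the previous successful match left just past the first AAA, so a second test on the same string fails even though the string still contains AAA.

```perl
use strict;
use warnings;

my $text = "AAA sometext";

# With /g, the first test succeeds and leaves pos($text) at 3.
my $first_g  = ($text =~ /AAA/g) ? 1 : 0;
# The second test resumes from offset 3, finds no further AAA, and fails.
my $second_g = ($text =~ /AAA/g) ? 1 : 0;

# Without /g, every test starts from the beginning of the string.
my $first_plain  = ($text =~ /AAA/) ? 1 : 0;
my $second_plain = ($text =~ /AAA/) ? 1 : 0;

print "with /g:    $first_g $second_g\n";
print "without /g: $first_plain $second_plain\n";
```

The failed /g match also resets pos($text), so the outcome of such a test depends on what ran before it on the same variable.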



Source: https://stackoverflow.com/questions/11439952/perl-regex-get-offset-of-all-the-matches-instead-of-one
