Beginner Regex: Multiple Replaces

Question

I have a string:

$mystring = "My cat likes to eat tomatoes.";

I want to do two replacements on this string with regex. I want to do s/cat/dog/ and s/tomatoes/pasta/. However, I don't know how to properly format the regular expression to do the multiple replacements in one expression, on one line, in one declaration. Right now, all I have is:

$mystring =~ s/cat/dog/ig;
$mystring =~ s/tomatoes/pasta/ig;

Answer 1:

My suggestion is that you do this:

my $text               =  'My cat likes to eat tomatoes.';
( my $format = $text ) =~ s/\b(cat|tomatoes)\b/%s/g;

Then you can just do this:

my $new_sentence = sprintf( $format, 'dog', 'pasta' );

As well as this:

$new_sentence    = sprintf( $format, 'tiger', 'asparagus' );
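
One thing to watch with the sprintf approach: any literal % already in the text becomes part of the format string. A minimal sketch of guarding against that, assuming a helper named make_format (a hypothetical name, not from the answer) that doubles percent signs before marking the slots:

```perl
use strict;
use warnings;

# make_format is a hypothetical helper name used only for this sketch.
sub make_format {
    my ($text) = @_;
    ( my $format = $text ) =~ s/%/%%/g;         # protect literal percent signs first
    $format =~ s/\b(?:cat|tomatoes)\b/%s/g;     # then turn each word into a slot
    return $format;
}

my $format = make_format('My cat likes to eat 100% of the tomatoes.');
my $new_sentence = sprintf( $format, 'dog', 'pasta' );
print "$new_sentence\n";    # My dog likes to eat 100% of the pasta.
```

The order matters: escaping after inserting the %s slots would double those as well.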

I agree with the others: you shouldn't want to do it all in one expression or on one line. But here is a way:

$text =~ s/\b(cat|tomatoes)\b/ ${{ qw<cat dog tomatoes pasta> }}{$1} /ge;

Answer 2:

As usual, use a hash as a lookup table, match keys, replace with values:

#!/usr/bin/perl

use strict;
use warnings;

use Regex::PreSuf;

my %repl = (
    cat => 'dog',
    tomatoes => 'pasta',
);

my $string = "My cat likes to eat tomatoes.";
my $re = presuf( keys %repl );

$string =~ s/($re)/$repl{$1}/ig;

print $string, "\n";

Output:

C:\Temp> t
My dog likes to eat pasta.

You could also use a loop:

for my $k ( keys %repl ) {
    $string =~ s/\Q$k/$repl{$k}/ig;
}
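
Note that the loop and the single substitution are not always equivalent: in the loop, each pass sees the output of the previous one. A small sketch of the difference, assuming a hypothetical table where one replacement is also another entry's key (the sub names are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical table: each value is also a key of the other entry.
my %swap = ( cat => 'dog', dog => 'cat' );

# Single pass: every match is looked up exactly once, so the words are exchanged.
sub replace_single_pass {
    my ( $text, $map ) = @_;
    my $alternation = join '|', map { quotemeta($_) } sort keys %$map;
    $text =~ s/\b($alternation)\b/$map->{$1}/g;
    return $text;
}

# Loop: keys are applied one after another (sorted here for a fixed order),
# so later substitutions rewrite the results of earlier ones.
sub replace_in_loop {
    my ( $text, $map ) = @_;
    for my $key ( sort keys %$map ) {
        $text =~ s/\b\Q$key\E\b/$map->{$key}/g;
    }
    return $text;
}

print replace_single_pass( 'The cat chased the dog.', \%swap ), "\n";  # The dog chased the cat.
print replace_in_loop( 'The cat chased the dog.', \%swap ), "\n";      # The cat chased the cat.
```

If none of the replacement values can be matched by another key, both versions give the same result.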

Answer 3:

Why would you want to?

I know some Perl-ers pride themselves on being able to write some of the most obfuscated code imaginable (see some of the code-golf questions on here), but that doesn't make it a smart thing to do.

Keep it readable and just keep it as it is; you'll thank yourself in the long run.

EDIT:

Certainly, if you are looking at 5 or more replacements, please (for the love of God) use some kind of lookup table. But DO NOT try to write one massive regex that does it all.

Answer 4:

If the things you're looking for are regular expressions themselves, a direct lookup table as in @Sinan Ünür's answer won't work, because the matched text is not the same string as the hash key (the string equality 123 eq '\d+' fails).

You can use Regexp::Assemble to get around this limitation:

use strict;
use warnings;
use Regexp::Assemble;

my %replace = (
    'cat' => 'dog',
    '(?:tom|pot)atoes' => 'pasta',
);
my $re = Regexp::Assemble->new->track(1)->add(keys %replace);

my $str = 'My cat likes to eat tomatoes.';
while (my $m = $re->match($str)) {
    $str =~ s/$m/$replace{$m}/;
}
print $str, $/;

$str = 'My cat likes to eat potatoes.';
while (my $m = $re->match($str)) {
    $str =~ s/$m/$replace{$m}/;
}
print $str, $/;

Both of these blocks produce: My dog likes to eat pasta.
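
If installing a module is not an option, the same idea can be approximated in core Perl. This is only a sketch of a different technique, not what Regexp::Assemble does internally: it keeps an ordered list of [pattern, replacement] pairs and, for each match, checks which pattern accounts for the matched text. The names @rules, replace_by_rules and replacement_for are made up for illustration.

```perl
use strict;
use warnings;

# Ordered [pattern, replacement] pairs; the first pattern that fits wins.
my @rules = (
    [ qr/cat/,              'dog'   ],
    [ qr/(?:tom|pot)atoes/, 'pasta' ],
);

# Return the replacement of the first rule whose pattern matches the whole text.
sub replacement_for {
    my ( $matched, $rules ) = @_;
    for my $rule (@$rules) {
        my ( $pattern, $replacement ) = @$rule;
        return $replacement if $matched =~ /\A(?:$pattern)\z/;
    }
    return $matched;    # no rule fits: leave the text unchanged
}

# Combine all patterns into one alternation and substitute in a single pass.
sub replace_by_rules {
    my ( $text, $rules ) = @_;
    my $combined = join '|', map { "(?:$_->[0])" } @$rules;
    $text =~ s{($combined)}{ replacement_for( $1, $rules ) }ge;
    return $text;
}

print replace_by_rules( 'My cat likes to eat tomatoes.', \@rules ), "\n";
print replace_by_rules( 'My cat likes to eat potatoes.', \@rules ), "\n";
```

Because it substitutes in one pass, a replacement that happens to match one of the patterns is not replaced again.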

Answer 5:

One very rudimentary way to perform multiple substitutions in a single line is to match the text with groupings. This will not let you find all instances of "cat" and replace them with "dog", but it will get you to "My dog likes to eat pasta."

$mystring =~ s/(.*)cat(.*)tomatoes(.*)/$1dog$2pasta$3/g;
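
To make the limitation concrete, here is a small sketch with a made-up sentence containing "cat" twice (replace_grouped is a hypothetical wrapper around the substitution above). Because the first (.*) is greedy, only the last "cat" before "tomatoes" is replaced:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the grouped substitution, for demonstration only.
sub replace_grouped {
    my ($text) = @_;
    # The greedy (.*) swallows everything up to the LAST "cat" that is
    # still followed by "tomatoes", so earlier occurrences are untouched.
    $text =~ s/(.*)cat(.*)tomatoes(.*)/${1}dog${2}pasta${3}/g;
    return $text;
}

print replace_grouped('My cat likes to eat tomatoes.'), "\n";      # My dog likes to eat pasta.
print replace_grouped('My cat and your cat eat tomatoes.'), "\n";  # My cat and your dog eat pasta.
```

It also does nothing at all if "tomatoes" appears before "cat" in the string.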

Answer 6:

You can do this the quick and dirty way, or the quick and clean way:

In both cases you need a hash mapping word => replacement.

With the quick and dirty way, you build the left part of the substitution by joining the keys of the hash with a '|'. To deal with overlapping words (e.g. 'cat' and 'catogan') you need to place the longest option first, by doing a reverse sort on the keys of the hash. You still can't deal with meta-characters in the words to replace (e.g. 'cat++').

The quick and clean way uses Regexp::Assemble to build the left part of the regexp. It deals natively with overlapping words, and it is simple to get it to deal with meta-characters in the words to replace.

Once you have the word to replace, you then replace it with the corresponding entry in the hash.

Below is a bit of code that shows the 2 methods, dealing with various cases:

#!/usr/bin/perl

use strict;
use warnings;

use Test::More tests => 6;

use Regexp::Assemble;

my $mystring = "My cat likes to eat tomatoes.";
my $expected = "My dog likes to eat pasta.";

my $repl;

# simple case
$repl= { 'cat' => 'dog', 'tomatoes' => 'pasta', };

is( 
    repl_simple($mystring, $repl), 
    $expected, 
    'look Ma, no module (simple)'
);  

my $re= regexp_assemble($repl);
is( 
    repl_assemble($mystring, $re), 
    $expected, 
    'with Regexp::Assemble (simple)'
);

# words overlap
$mystring = "My cat (catogan) likes to eat tomatoes.";
$expected = "My dog (doggie) likes to eat pasta.";

$repl= {'cat' => 'dog', 'tomatoes' => 'pasta', 'catogan'  => 'doggie', };

is( 
    repl_simple($mystring, $repl), 
    $expected, 
    'no module, words overlap'
);  

$re= regexp_assemble( $repl);
is( 
     repl_assemble($mystring, $re), 
     $expected, 
     'with Regexp::Assemble, words overlap'
);


# words to replace include meta-characters
# (the no-module version is expected to fail this test, as explained above)
$mystring = "My cat (felines++) likes to eat tomatoes.";
$expected = "My dog (wolves--) likes to eat pasta.";

$repl= {'cat' => 'dog', 'tomatoes' => 'pasta', 'felines++' => 'wolves--', };

is( 
    repl_simple($mystring, $repl), 
    $expected, 
    'no module, meta-characters in expression'
);  

$re= regexp_assemble( $repl);
is( 
    repl_assemble($mystring, $re), 
    $expected, 
    'with Regexp::Assemble, meta-characters in expression'
);

sub repl_simple { 
    my( $string, $repl)= @_;
    my $alternative= join( '|', reverse sort keys %$repl);
    $string=~ s{($alternative)}{$repl->{$1}}ig;
    return $string;
  }


sub regexp_assemble { 
    my( $repl)= @_;
    my $ra = Regexp::Assemble->new;
    foreach my $alt (keys %$repl)
      { $ra->add( '\Q' . $alt . '\E'); }
    return $ra->re;
  } 

sub repl_assemble { 
    my( $string, $re)= @_;
    $string=~ s{($re)}{$repl->{$1}}ig;
    return $string;
  }
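
For completeness, the no-module version can be made to cope with meta-characters by quoting each key before joining. A minimal sketch, assuming plain-string keys and case-sensitive matching (replace_words is a hypothetical name, and sorting by length is a substitute for the reverse sort used above):

```perl
use strict;
use warnings;

# Hypothetical core-only variant of repl_simple that quotes meta-characters.
sub replace_words {
    my ( $text, $map ) = @_;
    my $alternation = join '|',
        map  { quotemeta($_) }                               # 'felines++' becomes felines\+\+
        sort { length($b) <=> length($a) || $a cmp $b }      # longest key first
        keys %$map;
    # No /i here, so $1 is always an exact key of the hash.
    $text =~ s/($alternation)/$map->{$1}/g;
    return $text;
}

my %words = (
    'cat'       => 'dog',
    'catogan'   => 'doggie',
    'felines++' => 'wolves--',
    'tomatoes'  => 'pasta',
);

print replace_words( 'My cat (catogan) likes felines++ and tomatoes.', \%words ), "\n";
# My dog (doggie) likes wolves-- and pasta.
```

Dropping /i is a deliberate simplification: with case-insensitive matching, "Cat" would match but would not be found in the hash unless the lookup is normalised, for example with lc.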

Source: https://stackoverflow.com/questions/1476290/beginner-regex-multiple-replaces

Tags

regex

perl

replace