How can I escape meta-characters when I interpolate a variable in Perl's match operator?

天涯浪子 提交于 2019-11-30 11:57:17

The correct answer is - don't use regexps. I'm not saying regexps are bad, but using them for (what equals to) simple equality check is overkill.

Use: grep { lc($_) eq lc($word) } @arr and be happy.

Use \Q...\E to escape special symbols directly in perl string after variable value interpolation:

if( grep {$_ =~ m/^\Q$word\E$/i} @arr)

From perlfaq6's answer to How do I match a regular expression that's in a variable?:


We don't have to hard-code patterns into the match operator (or anything else that works with regular expressions). We can put the pattern in a variable for later use.

The match operator is a double quote context, so you can interpolate your variable just like a double quoted string. In this case, you read the regular expression as user input and store it in $regex. Once you have the pattern in $regex, you use that variable in the match operator.

chomp( my $regex = <STDIN> );

if( $string =~ m/$regex/ ) { ... }

Any regular expression special characters in $regex are still special, and the pattern still has to be valid or Perl will complain. For instance, in this pattern there is an unpaired parenthesis.

my $regex = "Unmatched ( paren";

"Two parens to bind them all" =~ m/$regex/;

When Perl compiles the regular expression, it treats the parenthesis as the start of a memory match. When it doesn't find the closing parenthesis, it complains:

Unmatched ( in regex; marked by <-- HERE in m/Unmatched ( <-- HERE  paren/ at script line 3.

You can get around this in several ways depending on our situation. First, if you don't want any of the characters in the string to be special, you can escape them with quotemeta before you use the string.

chomp( my $regex = <STDIN> );
$regex = quotemeta( $regex );

if( $string =~ m/$regex/ ) { ... }

You can also do this directly in the match operator using the \Q and \E sequences. The \Q tells Perl where to start escaping special characters, and the \E tells it where to stop (see perlop for more details).

chomp( my $regex = <STDIN> );

if( $string =~ m/\Q$regex\E/ ) { ... }

Alternately, you can use qr//, the regular expression quote operator (see perlop for more details). It quotes and perhaps compiles the pattern, and you can apply regular expression flags to the pattern.

chomp( my $input = <STDIN> );

my $regex = qr/$input/is;

$string =~ m/$regex/  # same as m/$input/is;

You might also want to trap any errors by wrapping an eval block around the whole thing.

chomp( my $input = <STDIN> );

eval {
    if( $string =~ m/\Q$input\E/ ) { ... }
    };
warn $@ if $@;

Or...

my $regex = eval { qr/$input/is };
if( defined $regex ) {
    $string =~ m/$regex/;
    }
else {
    warn $@;
    }

Quotemeta

Returns the value of EXPR with all non-"word" characters backslashed.

http://perldoc.perl.org/functions/quotemeta.html

I don't think you want a regex in this case since you aren't matching a pattern. You're looking for a literal sequence of characters that you already know. Build a hash with the values to match and use that to filter @arr:

 open my $fh, '<', $filename or die "...";
 my %hash = map { chomp; lc($_), 1 } <$fh>;

 foreach my $item ( @arr ) 
      {
      next unless exists $hash{ lc($item) };
      print "I matched [$item]\n";
      }
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!