Question
I have a string, for example:
my $string = 'String need to be evaluated';
In $string I am searching for the word evaluated (or any other word).
The problem is that tags may be unexpectedly inserted into the string,
e.g. Str<data>ing need to be eval<data>ua<data>ted.
How can I search for the words in this case?
Here is the code I tried:
my $string = 'Text to be evaluated';
my $string2 = "Te<data>xt need to be eval<data2>ua<data>ted";
# pattern to match
my $pattern = "evaluated";
my @b = split('', $pattern);
for my $i (@b) {
    # allow an optional literal <data> tag after each character
    $i = $i . '(?:<data>)?';
    print "$i#\n";
}
$pattern = join('', @b);
print "\n$pattern\n";
if ($string2 =~ /$pattern/) {
    print "$pattern found\n";
}
Can you suggest another method or a module that would make this easier? I don't know what kind of data will be inserted.
Answer 1:
I'm not sure this is what you need, but how about:
@b = split('', $pattern);
for my $i (@b) {
    # allow any run of characters after each letter of the pattern
    $i = $i . ".*";
    print "$i \n";
}
$pattern = join('', @b);
That should match any string that contained the pattern before it received random insertions, as long as the characters of the pattern are all still there and in the correct order.
It does find evaluated in the string esouhgvw8vwrg355#*asrgl/\u[\w]atet(45)<data>efdvd, which is about as noisy as it gets. But of course, if it is impossible to distinguish between an insertion and the original text, you will get "false" positives. For example, if the string used to be evaluted and it becomes something like evalu<hereisyourmissinga>ted, you will get a positive. If you know that insertions are always inside tags while the text never is, the tag-stripping answer is much safer.
As long as you single-quote your input string, characters like [\w], (45) and so on should not hurt either. They sit in the string being searched rather than in the pattern, so as far as I can see they are never interpolated at any point.
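To illustrate the trade-off described above, here is a minimal sketch of a stricter variant that only tolerates <...> tags between the characters of the word, rather than arbitrary text. It assumes every insertion looks like a tag, which the question does not guarantee; the helper name tag_tolerant_regex is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex that matches $word even when
# <...> tags have been inserted between its characters.
# Assumption: insertions always have the form <...>.
sub tag_tolerant_regex {
    my ($word) = @_;
    my $gap  = '(?:<[^>]*>)*';    # zero or more tags between two characters
    my $body = join $gap, map { quotemeta($_) } split //, $word;
    return qr/$body/;
}

my $re = tag_tolerant_regex('evaluated');

for my $text (
    'Te<data>xt need to be eval<data2>ua<data>ted',    # only tags inserted
    'evalu<hereisyourmissinga>ted',                    # "evaluted" plus a tag
) {
    printf "%-45s %s\n", $text, ($text =~ $re ? 'match' : 'no match');
}
```

With this pattern the first string matches and the second does not, because the missing letter would have to come from outside a tag. The price is that insertions which are not tags are no longer tolerated.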
Answer 2:
Of course, you could use a regexp to do the job:
foreach my $s ($string, $string2) {
    my $cs = $s;
    ### canonicalize: remove everything that looks like a <tag>
    $cs =~ s!<[^>]*>!!gs;
    ### match, case-insensitively
    if ($cs =~ m!$pattern!i) {
        print "Found $pattern in $s!\n";
    }
}
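Since the goal is whole-word matching, the same strip-then-match idea could be wrapped in a small function that also anchors the word with \b and quotes it with \Q...\E, so that regex metacharacters in the search word are taken literally. This is only a sketch: contains_word is a hypothetical name, and it again assumes the inserted data always looks like <...>.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $word occurs as a whole word in $text
# once every <...> tag has been removed, otherwise 0.
sub contains_word {
    my ($text, $word) = @_;
    (my $clean = $text) =~ s/<[^>]*>//g;    # work on a copy, strip the tags
    return $clean =~ /\b\Q$word\E\b/i ? 1 : 0;
}

print contains_word('Te<data>xt need to be eval<data2>ua<data>ted', 'evaluated'), "\n";
print contains_word('Text to be reevaluated', 'evaluated'), "\n";
```

The first call prints 1 and the second prints 0, because \b rejects evaluated when it is only part of a longer word.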
Source: https://stackoverflow.com/questions/21401964/whole-word-matching-with-unexpected-insertion-in-data