Regex to match words or phrases in a string but NOT match if part of a URL or inside tags (PHP)

终归单人心 2020-12-06 23:20

I am aware that regex is not ideal for use with HTML strings and I have looked at the PHP Simple HTML DOM Parser but still believe this is the way to go. All the HTML tags w

7 answers
  •  再見小時候
    2020-12-06 23:43

    Joe, resurrecting this question because it had a simple solution that wasn't mentioned. (Found your question while doing some research for a general question about how to exclude patterns in regex.)

    With all the usual disclaimers about using regex to parse HTML, here is a simple way to do it.

    Here's our simple regex:

    <[^>]*>(*SKIP)(*F)|amazon
    

    The left side of the alternation matches complete tags, then deliberately fails. The right side matches amazon, and we know this is the right amazon because it was not matched by the expression on the left.
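
    To see which occurrences survive, here is a minimal sketch that lists the matches and their offsets. It assumes the left side of the alternation is the tag pattern <[^>]*>, and the sample string is only an illustration:

```php
<?php
// Illustrative sample: one "amazon" inside a tag, one in plain text.
$subject = 'word1 <a stuff amazon> word2 amazon';

// Left branch: consume a whole tag, then force a failure. (*SKIP) stops the
// engine from retrying at any position inside the text it just consumed.
// Right branch: match the word itself.
$pattern = '~<[^>]*>(*SKIP)(*F)|amazon~i';

preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each match is a [text, byte offset] pair.
foreach ($matches[0] as [$text, $offset]) {
    echo "matched '$text' at byte offset $offset\n";
}
```

    Only the second amazon (byte offset 29) should be reported, because the first one sits inside the tag that the left branch consumed and discarded.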

    This program shows how to use the regex (see the results at the bottom of the online demo):

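    <?php
    // Sample input: the start of this string was lost from the post, so the
    // tagged part here is an illustrative stand-in.
    $target = "word1 <a stuff amazon> word2 amazon";
    // (?i) makes the match case-insensitive; tags are matched, then skipped.
    $regex = "~(?i)<[^>]*>(*SKIP)(*F)|amazon~";
    $repl = 'Amazon';
    $new = preg_replace($regex, $repl, $target);
    // Escape the result so any tags display literally in a browser.
    echo htmlentities($new);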
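
    The question title also asks to skip the word when it is part of a URL. One possible way to cover that, sketched here under assumptions, is to add a URL alternative to the part that gets skipped. The URL shape (http:// or https:// followed by characters that are neither whitespace nor "<") and the sample string are assumptions made for illustration, not something taken from the original question:

```php
<?php
// Hypothetical input: the word appears in a tag attribute, in a bare URL,
// and in plain text.
$subject = '<a href="http://amazon.com/x">shop</a> see http://www.amazon.com/deals or amazon today';

// Skip complete tags OR anything that looks like a URL (assumed shape),
// then match the word wherever it remains.
$pattern = '~(?:<[^>]*>|https?://[^\s<]+)(*SKIP)(*F)|amazon~i';

$result = preg_replace($pattern, 'Amazon', $subject);
echo $result, "\n";
```

    Here only the plain-text amazon before "today" should be replaced. Further contexts to exclude can be added as extra alternatives inside the non-capturing group.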
    

    Reference

    How to match (or replace) a pattern except in situations s1, s2, s3...
