Regex / DOMDocument - match and replace text not in a link

轮回少年 2020-12-01 07:06

I need to find and replace all text matches in a case insensitive way, unless the text is within an anchor tag - for example:

Match this text and re

7 answers
  •  误落风尘
    2020-12-01 07:31

    This is a stackless, non-recursive approach that uses a pre-order traversal of the DOM tree.

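      // $file (path to the HTML file) and $replacement are supplied by the caller.
      libxml_use_internal_errors(TRUE);
      $dom=new DOMDocument('1.0','UTF-8');
    
      $dom->substituteEntities=FALSE;
      $dom->recover=TRUE;
      $dom->strictErrorChecking=FALSE;
    
      $dom->loadHTMLFile($file);
      $root=$dom->documentElement;
      $node=$root;
      $flag=FALSE; // TRUE once we have climbed back up to a node whose subtree is done
      for (;;) {
          if (!$flag) {
              if ($node->nodeType==XML_TEXT_NODE) {
                  $node->nodeValue=preg_replace(
                      '/match this text/i',
                      $replacement, $node->nodeValue
                  );
              }
              // Never descend into <a>, so text nested at any depth
              // inside a link (e.g. <a><b>...</b></a>) is left alone.
              if ($node->firstChild && $node->nodeName!='a') {
                  $node=$node->firstChild;
                  continue;
              }
          }
          if ($node->isSameNode($root)) break;
          if ($node->nextSibling) {
              $node=$node->nextSibling;
              $flag=FALSE; // new node: process it and descend
          } else {
              $node=$node->parentNode;
              $flag=TRUE;  // subtree finished: do not re-enter it
          }
      }
      echo $dom->saveHTML();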
    

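    The call to libxml_use_internal_errors(TRUE); and the 3 property assignments after $dom=new DOMDocument; should let the parser cope with most malformed HTML: parse errors are collected internally instead of being raised as PHP warnings, and libxml's recovery mode still tries to build a usable tree.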
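
To see the error suppression and the traversal working together on broken markup, here is a minimal sketch that parses an HTML string instead of a file. The function name replaceOutsideLinks, its parameters, and the sample input are hypothetical and not part of the answer above; the sketch also assumes the input string is non-empty and that the dom extension is enabled.

```php
<?php
// Hypothetical helper (not from the original answer): applies $pattern
// to every text node that is not inside an <a> element.
function replaceOutsideLinks(string $html, string $pattern, string $replacement): string
{
    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->recover = true;
    $dom->strictErrorChecking = false;
    $dom->loadHTML($html);
    libxml_clear_errors();                  // discard the collected errors
    libxml_use_internal_errors($previous);  // restore the caller's setting

    $root = $dom->documentElement;
    if ($root === null) {
        return $html; // nothing was parsed
    }
    $node = $root;
    $ascended = false; // true when we have just climbed back to a parent
    while (true) {
        if (!$ascended) {
            if ($node->nodeType === XML_TEXT_NODE) {
                $node->nodeValue = preg_replace($pattern, $replacement, $node->nodeValue);
            }
            // Skip the whole subtree of <a>, whatever its nesting depth.
            if ($node->firstChild && $node->nodeName !== 'a') {
                $node = $node->firstChild;
                continue;
            }
        }
        if ($node->isSameNode($root)) {
            break;
        }
        if ($node->nextSibling) {
            $node = $node->nextSibling;
            $ascended = false;
        } else {
            $node = $node->parentNode;
            $ascended = true;
        }
    }
    return $dom->saveHTML();
}

// Sample input with unclosed <p> and <b> tags (made up for illustration).
$out = replaceOutsideLinks(
    '<p>Match This Text <a href="#">match this text</a> and <b>MATCH THIS TEXT',
    '/match this text/i',
    'REPLACED'
);
echo $out;
```

The link text is left unchanged while the two matches outside the link are replaced. Restoring the previous libxml_use_internal_errors setting is a design choice of this sketch, so that the helper does not silently change error handling for the rest of the script.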
