How do I programmatically add rel="external" to external links in a string of HTML?

Posted by 不打扰是莪最后的温柔 on 2019-12-08 11:28:32

Question


How can I check whether the links in a string variable are external? The string holds site content (comments, articles, etc.).

If they are, how do I append the value external to their rel attribute? And if a link has no rel attribute, how do I add rel="external"?


Answer 1:


An HTML parser is appropriate for input filtering, but for modifying output you'll want the performance of a simple-minded regex solution. In this case a regex with a callback would do:

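// The pattern is single-quoted so the double quotes inside it need no escaping.
$html = preg_replace_callback('#<a\s[^>]*href="(https?://[^"]+)"[^>]*>#i',
     "cb_ext_url", $html);

function cb_ext_url($match) {
    list ($orig, $url) = $match;
    // Links to our own host (here: localhost) are internal, leave them alone.
    if (preg_match('#^https?://localhost/#i', $url)) {
        return $orig;
    }
    // The tag already carries a rel attribute: returned unchanged in this sketch.
    elseif (strstr($orig, "rel=")) {
        return $orig;
    }
    // External link without rel: add the attribute before the closing ">".
    else {
        return rtrim($orig, ">") . ' rel="external">';
    }
}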

You'll probably need more fine-grained checks. But that's the general approach.




Answer 2:


Use an XML parser, like SimpleXML. Regex isn't made to do XML/HTML parsing, and here's a perfect explanation of what happens when you do: RegEx match open tags except XHTML self-contained tags.

Parse the input as XML, use the parser to select the required elements, edit their properties using the parser, and spit them back out.

It'll save you a headache, as regex makes me cry...


Here's my way of doing this (didn't test it):

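<?php

// Hypothetical helper (the original answer left it undefined): a link is
// external when it names a host other than the site's own, assumed here
// to be yourdomain.com.
function isThisExternal($href)
{
  $host = parse_url($href, PHP_URL_HOST);
  return is_string($host) && strcasecmp($host, 'yourdomain.com') !== 0;
}

// Must be well-formed XML with a single root element, or the constructor throws.
$xmlString = "This is where the HTML of your site should go. Make sure it's valid!";

$xml = new SimpleXMLElement($xmlString);

// SimpleXMLElement has no getElementsByTagName(); select the links with XPath.
foreach ($xml->xpath('//a') as $a)
{
  if (isset($a['href']) && isThisExternal((string) $a['href']))
  {
    // Keep any existing rel value and append "external" to it.
    $rel = trim((string) $a['rel']);
    $a['rel'] = ($rel === '') ? 'external' : $rel . ' external';
  }
}

echo $xml->asXML();

?>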



Answer 3:


It might be easier to do something like this on the client side, using jQuery:

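<script type="text/javascript">
    $(document).ready(function()
    {
        // Only anchors that have an href; .attr('href') is undefined otherwise.
        $.each($('a[href]'), function(idx, tag)
        {
            // Check the hostname only, so that URLs like
            // http://www.otherdomain.com/yourdomain.com still count as external
            // and relative links (which resolve to the current host) do not.
            if (tag.hostname && tag.hostname.indexOf('yourdomain.com') < 0)
            {
                // Append to an existing rel value instead of overwriting it.
                var rel = $(tag).attr('rel');
                $(tag).attr('rel', rel ? rel + ' external' : 'external');
            }
        });
    });
</script>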

As Craig White points out though, this doesn't do anything SEO-wise and won't help users who have JavaScript disabled.
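
The two decisions discussed in this thread (is a link external, and how to extend an existing rel value without clobbering it) can be separated from the DOM handling. Below is a minimal sketch in plain JavaScript, not taken from any of the answers. The names isExternal, addRelToken and SITE_HOST are hypothetical, and yourdomain.com is only a placeholder for your own host. It assumes that subdomains of your host count as internal and that non-HTTP links (mailto: and the like) should be left alone.

```javascript
// Assumption: placeholder for your own domain, not a value from the thread.
const SITE_HOST = 'yourdomain.com';

// Returns true when href points at a host other than siteHost.
// Relative links are resolved against the site, so they count as internal.
function isExternal(href, siteHost = SITE_HOST) {
  let url;
  try {
    url = new URL(href, 'http://' + siteHost + '/');
  } catch (e) {
    return false; // unparseable href: treat as not external
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return false; // mailto:, tel:, javascript: and similar
  }
  const host = url.hostname.toLowerCase();
  const site = siteHost.toLowerCase();
  // Same host or a subdomain of it is internal; everything else is external.
  return host !== site && !host.endsWith('.' + site);
}

// Adds token to a space-separated rel value, keeping the existing tokens
// and avoiding duplicates. A missing rel is treated as an empty list.
function addRelToken(rel, token) {
  const tokens = (rel || '').split(/\s+/).filter(Boolean);
  if (!tokens.includes(token)) {
    tokens.push(token);
  }
  return tokens.join(' ');
}
```

Inside a jQuery loop this could be used as $(tag).attr('rel', addRelToken($(tag).attr('rel'), 'external')) whenever isExternal($(tag).attr('href')) is true. Comparing parsed hostnames instead of searching the whole URL string is what keeps a path such as /yourdomain.com on another site from being mistaken for an internal link.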



Source: https://stackoverflow.com/questions/5608874/how-do-i-programmatically-add-rel-external-to-external-links-in-a-string-of-ht
