Regex / PHP code to check if a URL is a short URL

99封情书 submitted on 2019-12-06 02:58:45

I would not recommend using a regex here, as it would be too complex and difficult to understand. Here is PHP code that checks all your constraints:

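function _is_short_url($url){
        // 1. Overall URL length - may be a max of 30 characters
        if (strlen($url) > 30) return false;

        $parts = parse_url($url);

        // parse_url() returns false for seriously malformed URLs; a URL without a host cannot be a short URL
        if ($parts === false || empty($parts["host"])) return false;

        // No query string & no fragment (isset avoids undefined-key warnings)
        if (isset($parts["query"]) || isset($parts["fragment"])) return false;

        // The path key is absent for URLs such as http://bit.ly
        $path = isset($parts["path"]) ? $parts["path"] : "";

        // 3. Number of '/' after protocol (http://) - max 2
        if (substr_count($path, "/") > 2) return false;

        // 2. URL length after last '/' - may be a max of 10 characters
        $pathParts = explode("/", $path);
        $lastPath = array_pop($pathParts);
        if (strlen($lastPath) > 10) return false;

        // 4. Max length of host - 10 characters
        if (strlen($parts["host"]) > 10) return false;

        return true;
}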

Here is a small function that checks all your requirements. I was able to do it without a complex regex, using only preg_split. You should be able to adapt it easily.

<?php

var_dump(_isShortUrl('http://bit.ly/foo'));

function _isShortUrl($url)
{
    // Check for max URL length (30)
    if (strlen($url) > 30) {
        return false;
    }

    // Check whether there are more than two slashes after the protocol (more than 5 split values)
    $parts = preg_split('/\//', $url);
    if (count($parts) > 5) {
        return false;
    }

    // Check for max host length (10); $parts[2] may be unset if the URL has no protocol
    $host = isset($parts[2]) ? $parts[2] : '';
    if (strlen($host) > 10) {
        return false;
    }

    // Check for max length of last URL part (after last slash)
    $lastPart = array_pop($parts);
    if (strlen($lastPart) > 10) {
        return false;
    }

    return true;
}

If I were you, I would test whether the URL returns a 301 redirect, and then test whether that redirect points to another website:

function _is_short_url($url) {
   $options['http']['method'] = 'HEAD';
   stream_context_set_default($options); # don't fetch the full page
   $headers = get_headers($url,1);
   if ( isset($headers[0]) ) {
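     if (strpos($headers[0],'301')!==false && isset($headers['Location'])) {
       // When several redirects are followed, Location is an array; its first entry belongs to $url
       $location = is_array($headers['Location']) ? $headers['Location'][0] : $headers['Location'];
       $url = parse_url($url);
       $location = parse_url($location);
       // A relative Location has no host, so it stays on the same website
       if (isset($url['host'], $location['host']) && $url['host'] != $location['host'])
         return true;
     }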
   }

   return false;
}

echo (int)_is_short_url('http://bit.ly/1GoNYa');

Why not check whether the host matches a known URL shortener? You could get a list of the most common URL shorteners, for example here.
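
A minimal sketch of that idea, assuming you keep the list in a plain array. The function name _is_known_shortener and the handful of hosts below are only illustrative assumptions, not the list linked above, so swap in whatever list you actually maintain:

```php
<?php

// Hypothetical helper: the default host list is a tiny illustrative sample
// and should be replaced with a maintained list of shortener domains.
function _is_known_shortener($url, array $shorteners = array('bit.ly', 'tinyurl.com', 't.co', 'goo.gl', 'ow.ly', 'is.gd'))
{
    $host = parse_url($url, PHP_URL_HOST);

    // parse_url() yields null when the host is missing and false on malformed input
    if (!is_string($host)) {
        return false;
    }

    $host = strtolower($host);

    // Treat www.bit.ly the same as bit.ly
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    // Strict comparison against the list of known shortener hosts
    return in_array($host, $shorteners, true);
}

var_dump(_is_known_shortener('http://bit.ly/1GoNYa'));      // host is in the list
var_dump(_is_known_shortener('http://example.com/1GoNYa')); // host is not in the list
```

This avoids any network request, but it only recognises shorteners that are on your list, so it may be worth combining with one of the checks above.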
