Parsing domain from a URL

独厮守ぢ 2020-11-22 12:26

I need to build a function which parses the domain from a URL.

So, with

http://google.com/dhasjkdas/sadsdds/sdda/sdads.html

or

18 Answers
  •  生来不讨喜
    2020-11-22 12:38

    Here is the code I wrote to find only the domain name; it takes Mozilla's list of sub-TLDs into account. The only thing you have to watch is how you cache that file, so you don't query Mozilla every time.

    For some strange reason, domains like co.uk are not in the list, so you have to do some hacking and add them manually. It's not the cleanest solution, but I hope it helps someone.

    //=====================================================
    static function domain($url)
    {
        $url = strtolower($url);

        // Mozilla's effective-TLD list; the parsed result is cached so it is not fetched on every call.
        $address = 'http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1';
        if(!$subtlds = @kohana::cache('subtlds', null, 60))
        {
            $subtlds = array();
            $content = file($address);
            foreach($content as $num => $line)
            {
                $line = trim($line);
                if($line == '') continue;
                if($line[0] == '/') continue;                 // skip "//" comment lines
                $line = preg_replace("/[^a-zA-Z0-9\.]/", '', $line);
                if($line == '') continue;
                if($line[0] == '.') $line = substr($line, 1);
                if(!strstr($line, '.')) continue;             // keep only multi-label suffixes
                $subtlds[] = $line;
                //echo "{$num}: '{$line}'";
            }
            // Suffixes missing from the parsed list, added by hand.
            $subtlds = array_merge(array(
                'co.uk', 'me.uk', 'net.uk', 'org.uk', 'sch.uk', 'ac.uk', 'gov.uk', 'nhs.uk', 'police.uk', 'mod.uk',
                'asn.au', 'com.au', 'net.au', 'id.au', 'org.au', 'edu.au', 'gov.au', 'csiro.au',
            ), $subtlds);
            $subtlds = array_unique($subtlds);
            //echo var_dump($subtlds);
            @kohana::cache('subtlds', $subtlds);
        }

        // Extract the host, allowing an optional http:// or https:// prefix.
        preg_match('/^(https?:[\/]{2,})?([^\/]+)/i', $url, $matches);
        $host = @$matches[2];

        // Default: the last two labels of the host.
        preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
        foreach($subtlds as $sub)
        {
            // Host ends in a known two-part suffix: keep three labels instead.
            if(preg_match('/\.' . preg_quote($sub, '/') . '$/', $host))
                preg_match("/[^\.\/]+\.[^\.\/]+\.[^\.\/]+$/", $host, $matches);
        }
        return @$matches[0];
    }

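For anyone not on Kohana, the same idea can be sketched in plain PHP. This is only a rough sketch under some assumptions: the function names (parse_suffix_lines, load_suffixes, registrable_domain) are made up for illustration, the download is passed in as a callback ($fetchLines) so you can plug in whatever source you use, the cache is a simple JSON file, and wildcard ("*.") and exception ("!") rules in the list are skipped rather than handled.

```php
<?php
// Sketch only: every function name below is hypothetical, not part of the answer above.

// Turn raw lines of the suffix list into a set of multi-label suffixes ("co.uk").
// Assumption: wildcard ("*.") and exception ("!") rules are skipped, not interpreted.
function parse_suffix_lines(array $lines): array
{
    $suffixes = array();
    foreach ($lines as $line) {
        $line = strtolower(trim($line));
        if ($line === '' || substr($line, 0, 2) === '//') {
            continue;                       // blank line or comment
        }
        if ($line[0] === '*' || $line[0] === '!') {
            continue;                       // simplification, see above
        }
        if (strpos($line, '.') !== false) {
            $suffixes[$line] = true;        // array keys de-duplicate and give fast lookups
        }
    }
    return $suffixes;
}

// Return the parsed list from a JSON cache file, refreshing it through
// $fetchLines (any callable returning an array of lines) when missing or too old.
function load_suffixes(string $cacheFile, int $maxAgeSeconds, callable $fetchLines): array
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAgeSeconds) {
        $cached = json_decode(file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached;
        }
    }
    $suffixes = parse_suffix_lines($fetchLines());
    file_put_contents($cacheFile, json_encode($suffixes));
    return $suffixes;
}

// Reduce a URL to its domain: three labels if the host ends in a known
// two-part suffix, otherwise two. Returns null when no domain can be found.
function registrable_domain(string $url, array $suffixes): ?string
{
    // parse_url() only reports a host when a scheme or a leading "//" is present.
    if (strpos($url, '//') === false) {
        $url = '//' . $url;
    }
    $host = parse_url(strtolower($url), PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    $labels = explode('.', $host);
    $count = count($labels);
    if ($count < 2) {
        return null;
    }
    $lastTwo = $labels[$count - 2] . '.' . $labels[$count - 1];
    if (isset($suffixes[$lastTwo])) {
        return $count >= 3 ? implode('.', array_slice($labels, -3)) : null;
    }
    return implode('.', array_slice($labels, -2));
}

$demo = parse_suffix_lines(array('// sample lines', 'com', 'co.uk', 'com.au'));
echo registrable_domain('http://google.com/dhasjkdas/sadsdds/sdda/sdads.html', $demo), "\n"; // google.com
echo registrable_domain('https://www.example.co.uk/page', $demo), "\n";                       // example.co.uk
```

Using parse_url() instead of a hand-written regex means https, ports and query strings are handled by PHP itself. Suffixes longer than two labels and IP-address hosts are not covered by this sketch.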