How do you strip out the domain name from a URL in PHP?

遥遥无期 2020-12-02 22:12

I'm looking for a method (or function) to strip out the domain.ext part of any URL that's fed into the function. The domain extension can be anything (.com, .co.uk, .nl, .what

9 Answers
  • 2020-12-02 22:48

    You can also write a regular expression to get exactly what you want.

    Here is my attempt at it:

    $pattern = '/\w+\..{2,3}(?:\..{2,3})?(?:$|(?=\/))/i';
    $url = 'http://www.example.com/foo/bar?hat=bowler&accessory=cane';
    if (preg_match($pattern, $url, $matches) === 1) {
        echo $matches[0];
    }
    

    The output is:

    example.com
    

    This pattern also takes into consideration domains such as 'example.com.au'.

    Note: I have not consulted the relevant RFC.

  • 2020-12-02 22:50

    This function should work:

    function Delete_Domain_From_Url($Url = false)
    {
        if($Url)
        {
            $Url_Parts = parse_url($Url);
            $Url = isset($Url_Parts['path']) ? $Url_Parts['path'] : '';
            $Url .= isset($Url_Parts['query']) ? "?".$Url_Parts['query'] : '';
        }
    
        return $Url;
    }
    

    To use it:

    $Url = "https://stackoverflow.com/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php";
    echo Delete_Domain_From_Url($Url);
    
    # Output: 
    #/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php
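
    Note that the function above drops any #fragment part of the URL. If the fragment should be kept, a variant might look like the following. This is only a sketch: the name stripOrigin() is made up for this example, and returning a malformed URL unchanged is an assumption about the desired behavior.

    ```php
    <?php
    // Sketch: strip scheme and host, keep path, query string and fragment.
    // stripOrigin() is a hypothetical name chosen for this example.
    function stripOrigin(string $url): string
    {
        $parts = parse_url($url);
        if ($parts === false) {
            return $url; // seriously malformed URL: return it unchanged (assumption)
        }

        $result = $parts['path'] ?? '';
        if (isset($parts['query'])) {
            $result .= '?' . $parts['query'];
        }
        if (isset($parts['fragment'])) {
            $result .= '#' . $parts['fragment'];
        }

        return $result;
    }

    echo stripOrigin('https://example.com/docs/page.php?id=3#install'), "\n";
    // /docs/page.php?id=3#install
    ```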
    
  • 2020-12-02 22:52

    Here are a couple of simple functions to get the root domain (example.com) from a normal or long domain (test.sub.domain.com) or URL (http://www.example.com).

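    /**
     * Get the root domain from a full domain.
     * Assumes a single-label TLD: 'www.example.co.uk' yields 'co.uk'.
     * @param string $domain e.g. 'test.sub.domain.com'
     * @return string e.g. 'domain.com'
     */
    function getRootDomain($domain)
    {
        $parts = explode('.', $domain);

        // The last two labels are the TLD and the name.
        $tld = array_pop($parts);
        $name = array_pop($parts);

        // A single-label host such as 'localhost' has no second label.
        return $name === null ? $tld : "$name.$tld";
    }

    /**
     * Get the root domain from a URL.
     * @param string $url e.g. 'http://www.example.com'
     * @return string
     */
    function getDomainFromUrl($url)
    {
        // parse_url() returns null when the URL has no host component.
        $host = parse_url($url, PHP_URL_HOST);

        return getRootDomain((string) $host);
    }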
    
  • 2020-12-02 22:57

    The only reliable way to extract domain parts is to use the Public Suffix List (a database of TLDs). I recommend the TLDExtract package; here is sample code:

    $extract = new LayerShifter\TLDExtract\Extract();
    
    $result = $extract->parse('www.domain.com/path/script.php?=whatever');
    $result->getSubdomain(); // will return (string) 'www'
    $result->getHostname(); // will return (string) 'domain'
    $result->getSuffix(); // will return (string) 'com'
    
  • 2020-12-02 22:58

    You can use parse_url() to do this:

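    $url = 'http://www.example.com';
    $domain = parse_url($url, PHP_URL_HOST);
    // Strip only a leading "www."; str_replace() would also remove it mid-host.
    $domain = preg_replace('/^www\./i', '', $domain);
    

    In this example, $domain contains example.com, whether or not the URL has a www. prefix. It also works for hosts such as example.co.uk, because only the www. prefix is removed; any other subdomain is left in place.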
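
    One caveat with parse_url(): for input that lacks a scheme, such as www.example.com/path, it treats the whole string as a path, so PHP_URL_HOST yields null. A minimal sketch of a workaround, assuming it is acceptable to treat such input as http (the helper name hostFromUrl() is made up for this example):

    ```php
    <?php
    /**
     * Sketch: extract the host even when the input has no scheme.
     * hostFromUrl() is a hypothetical helper, not a built-in.
     */
    function hostFromUrl(string $url): ?string
    {
        // Prepend a scheme if none is present so parse_url() can find the host.
        // Treating scheme-less input as http is an assumption of this sketch.
        if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
            $url = 'http://' . ltrim($url, '/');
        }

        $host = parse_url($url, PHP_URL_HOST);

        return is_string($host) ? $host : null;
    }

    var_dump(parse_url('www.example.com/path', PHP_URL_HOST)); // NULL
    echo hostFromUrl('www.example.com/path'), "\n";            // www.example.com
    echo hostFromUrl('https://example.co.uk/'), "\n";          // example.co.uk
    ```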

  • 2020-12-02 23:01

    I spent some time thinking about whether it makes sense to use a regular expression for this, but in the end I think not.

    firstresponder's regexp came close to convincing me it was the best way, but it didn't work on anything missing a trailing slash (http://example.com, for instance). I fixed that with the following: '/\w+\..{2,3}(?:\..{2,3})?(?=[\/\W])/i', but then I realized that it matches twice for URLs like 'http://example.com/index.htm'. Oops. That wouldn't be so bad (just use the first one), but it also matches twice on something like this: 'http://abc.ed.fg.hij.kl.mn/', and the first match isn't the right one. :(

    A co-worker suggested just getting the host (via parse_url()) and then taking the last two or three array elements (explode() on '.'; the old split() was removed in PHP 7). Whether to take two or three would be based on a list of suffixes such as 'co.uk'. Compiling that list becomes the hard part.
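
    The co-worker's suggestion can be sketched roughly as follows. This is a minimal illustration, not a complete solution: the function name registrableDomain() and the tiny $twoPartSuffixes list are assumptions made for this example, and a realistic list would have to come from something like the Public Suffix List.

    ```php
    <?php
    /**
     * Sketch: return the last two or three labels of a URL's host.
     * $twoPartSuffixes is a hypothetical, deliberately incomplete list.
     */
    function registrableDomain(string $url, array $twoPartSuffixes = ['co.uk', 'com.au', 'co.nz']): ?string
    {
        $host = parse_url($url, PHP_URL_HOST);
        if (!is_string($host) || $host === '') {
            return null; // no host found (e.g. the URL has no scheme)
        }

        $labels = explode('.', strtolower($host));

        // Keep three labels when the last two form a known two-part suffix.
        $lastTwo = implode('.', array_slice($labels, -2));
        $keep = in_array($lastTwo, $twoPartSuffixes, true) ? 3 : 2;

        // array_slice() simply returns every label if there are fewer than $keep.
        return implode('.', array_slice($labels, -$keep));
    }

    echo registrableDomain('http://www.example.com/foo/bar'), "\n";    // example.com
    echo registrableDomain('https://shop.example.co.uk/basket'), "\n"; // example.co.uk
    ```

    Keeping the suffix list as a parameter makes it easy to swap in a fuller list later without touching the splitting logic.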
