What is the preferred solution for checking whether a URL is relative or absolute?
I can't comment on the accepted answer, so I'm posting this as a new answer. IMO, checking the scheme as in the accepted answer (bool(urlparse.urlparse(url).scheme)) is not really a good idea: http://example.com/file.jpg, https://example.com/file.jpg and //example.com/file.jpg are all absolute URLs, but in the last case we get scheme = ''.
I use this code:
is_absolute = '//' in my_url
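To make the difference concrete, here is a minimal sketch that compares the three checks side by side. It assumes Python 3 (urllib.parse rather than the Python 2 urlparse module); the helper name describe and the sample URLs are only for illustration.

```python
from urllib.parse import urlparse

# Sample URLs, for illustration only.
urls = [
    "http://example.com/file.jpg",
    "https://example.com/file.jpg",
    "//example.com/file.jpg",
    "/file.jpg",
]

def describe(url):
    """Return (has_scheme, has_netloc, contains_double_slash) for a URL."""
    parts = urlparse(url)
    return (bool(parts.scheme), bool(parts.netloc), "//" in url)

results = {url: describe(url) for url in urls}

for url, flags in results.items():
    print(url, flags)
```

One thing to keep in mind: the plain substring test is also true when '//' appears later in an otherwise relative URL, for example inside a query string such as /redirect?next=http://example.com, so it may be worth combining it with a stricter check depending on your input.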
You can use the urlparse module to parse a URL and then check whether it is relative or absolute by checking whether it has the host name set.
>>> import urlparse
>>> def is_absolute(url):
...     return bool(urlparse.urlparse(url).netloc)
...
>>> is_absolute('http://www.example.com/some/path')
True
>>> is_absolute('//www.example.com/some/path')
True
>>> is_absolute('/some/path')
False
In Python 3, urlparse has been moved to urllib.parse, so use the following:
from urllib.parse import urlparse
def is_absolute(url):
    return bool(urlparse(url).netloc)
Not sure what you're asking about.
Are you just looking to see if it begins with http://?
If so, a simple regex will do the trick.
(EDIT: See comment below -- a very good point!!)
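As a rough sketch of the regex idea (the pattern and the function name starts_with_http are my own assumptions, not something from the question), a prefix check could look like this:

```python
import re

# Hypothetical pattern: a leading "http://" or "https://", case-insensitive.
HTTP_PREFIX = re.compile(r"^https?://", re.IGNORECASE)

def starts_with_http(url):
    """Return True if the URL starts with http:// or https://."""
    return HTTP_PREFIX.match(url) is not None

print(starts_with_http("http://www.example.com/some/path"))
print(starts_with_http("/some/path"))
```

Note that this only looks at the prefix: it returns False for scheme-less URLs such as //www.example.com/some/path and for other schemes such as ftp://, so it is only suitable if http(s) is really all you care about.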
If you want to know whether a URL is absolute or relative in order to join it with a base URL, I usually just use urlparse.urljoin anyway:
>>> from urlparse import urljoin
>>> urljoin('http://example.com/', 'http://example.com/picture.png')
'http://example.com/picture.png'
>>> urljoin('http://example1.com/', '/picture.png')
'http://example1.com/picture.png'
>>>
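On Python 3 the same function lives in urllib.parse. A minimal sketch of the same approach, where the BASE value and the resolve helper are hypothetical names used only for this example:

```python
from urllib.parse import urljoin

BASE = "http://example.com/"  # hypothetical base URL for this example

def resolve(url, base=BASE):
    """Resolve url against base; absolute URLs come back unchanged."""
    return urljoin(base, url)

print(resolve("http://example.com/picture.png"))
print(resolve("/picture.png"))
print(resolve("//cdn.example.com/picture.png"))
```

The design choice here is that you never need to decide up front whether the input is relative: urljoin leaves a full URL alone, resolves a path against the base, and fills in the base's scheme for a URL that starts with //.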