ValueError: unknown url type in urllib2, though the url is fine if opened in a browser

Submitted by 我与影子孤独终老i on 2019-12-18 14:12:59

Question


Basically, I am trying to download a URL using urllib2 in Python.

The code is the following:

import urllib2
req = urllib2.Request('www.tattoo-cover.co.uk')
req.add_header('User-agent','Mozilla/5.0')
result = urllib2.urlopen(req)

It raises a ValueError and the program crashes for the URL in the example. When I open the URL in a browser, it works fine.

Any ideas how to handle the problem?

UPDATE:

Thanks to Ben James and sth, the problem has been identified: the URL needs the 'http://' prefix.

Now the question is refined: is it possible to handle such cases automatically with some built-in function, or do I have to do error handling with subsequent string concatenation?


Answer 1:


When you enter a URL in a browser without the protocol, it defaults to HTTP. urllib2 won't make that assumption for you; you need to prefix it with http://.




Answer 2:


You have to use a complete URL including the protocol, not just specify a host name.

The correct URL would be http://www.tattoo-cover.co.uk/.
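
As a rough illustration of the point above, here is a sketch using urllib.request, the Python 3 counterpart of urllib2 (an assumption made so the example runs on a current interpreter). The helper name build_request is hypothetical. Note that in Python 3 the ValueError is raised as soon as the Request object is constructed, so no network access is needed to see it.

```python
import urllib.request


def build_request(url):
    """Build a request with the same header as in the question."""
    req = urllib.request.Request(url)
    req.add_header('User-agent', 'Mozilla/5.0')
    return req


# Without a scheme, urllib cannot tell which protocol handler to use.
error_message = None
try:
    build_request('www.tattoo-cover.co.uk')
except ValueError as exc:
    error_message = str(exc)

# With the scheme included, the request is built without complaint.
req = build_request('http://www.tattoo-cover.co.uk/')

# Actually fetching the page needs network access, so it is left commented out:
# result = urllib.request.urlopen(req)
```

In Python 2 the equivalent calls are urllib2.Request and urllib2.urlopen, as in the question.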




Answer 3:


You can use the urlparse function from urllib.parse (Python 3) to check whether an addressing scheme (http, https, ftp) is present, and prepend one if it is missing:

In [1]: from urllib.parse import urlparse
    ..: 
    ..: url = 'www.myurl.com'
    ..: if not urlparse(url).scheme:
    ..:     url = 'http://' + url
    ..: 
    ..: url
Out[1]: 'http://www.myurl.com'
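
The same check could be wrapped in a small reusable function. The following is only a sketch: the name ensure_scheme and the default_scheme parameter are hypothetical, and the handling of scheme-relative input such as '//host/path' is an added assumption rather than part of the answer above.

```python
from urllib.parse import urlparse


def ensure_scheme(url, default_scheme='http'):
    """Return url unchanged if it already has a scheme, else prepend one."""
    url = url.strip()
    if urlparse(url).scheme:
        # A scheme such as http, https or ftp is already present.
        return url
    if url.startswith('//'):
        # Scheme-relative form: only the scheme name and colon are missing.
        return default_scheme + ':' + url
    return default_scheme + '://' + url


fixed = ensure_scheme('www.myurl.com')
untouched = ensure_scheme('https://www.myurl.com')
```

One caveat: input of the form host:port (for example 'localhost:8080') may be parsed with the host name as the scheme, depending on the Python version, so such strings would need separate handling.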



Answer 4:


I think you can use the urlparse function for that; see the Python documentation.



Source: https://stackoverflow.com/questions/5823572/valueerror-unknown-url-type-in-urllib2-though-the-url-is-fine-if-opened-in-a-b
