Unknown url type error in urllib2

Submitted by 无人久伴 on 2019-12-06 03:28:45

It is hard to tell without seeing the HTML of the page you are scraping. However, a stray ' (single quote) character at the beginning of the URL might be the cause, since it produces the same exception:

>>> import urllib2
>>> urllib2.urlopen("'http://blah.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "urllib2.py", line 404, in open
    response = self._open(req, data)
  File "urllib2.py", line 427, in _open
    'unknown_open', req)
  File "urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "urllib2.py", line 1249, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: 'http>

So, try cleaning up your URL by removing any stray quotes.

Update after OP feedback:

The output of the print statement shows that the URL string has a single quote character at both its beginning and its end. There should not be any quotes of any kind around the URL when it is passed to urlopen(). You can remove leading and trailing quotes (both single and double) from the URL string with this:

url = url.strip('\'"')
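
If the scraped URLs may also carry surrounding whitespace, the same idea can be wrapped in a small helper that cleans the string and checks the scheme before calling urlopen(). This is only a sketch: clean_url is a hypothetical name, and accepting only http and https is an assumption about what the scraper needs.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, where urllib2 lives


def clean_url(raw):
    """Hypothetical helper: strip stray quotes/whitespace, then check the scheme."""
    # Remove any leading/trailing whitespace, single quotes and double quotes.
    url = raw.strip(' \t\r\n\'"')
    scheme = urlparse(url).scheme
    # Assumption: only http and https URLs are expected from the scraped page.
    if scheme not in ('http', 'https'):
        raise ValueError('unexpected URL scheme %r in %r' % (scheme, raw))
    return url


# repr() makes any quotes that are still part of the string visible.
print(repr(clean_url("'http://blah.com'")))
```

Raising ValueError early means a malformed URL is reported together with its raw value, instead of surfacing later as an "unknown url type" error from urlopen().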