Python's urllib2 doesn't work on some sites

Posted by 本小妞迷上赌 on 2019-12-01 11:29:25

I believe the request is being blocked because of its User-Agent header. You can change the User-Agent with the following sample code:

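import urllib2

# Replace 'something' with the User-Agent string the site should see.
USERAGENT = 'something'
HEADERS = {'User-Agent': USERAGENT}

# URL_HERE is a placeholder for the address you want to fetch.
req = urllib2.Request(URL_HERE, headers=HEADERS)
f = urllib2.urlopen(req)
s = f.read()  # response body as a string
f.close()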
Jad

Try setting a different user agent. Check the answers in this link.
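
For readers on Python 3, where urllib2's functionality lives in urllib.request, setting a different user agent might look like the sketch below. The names build_request, fetch, and BROWSER_USER_AGENT, as well as the user-agent string itself, are illustrative assumptions and do not come from the answers here.

```python
import urllib.request

# Hypothetical browser-like string; substitute whatever the target site accepts.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleFetcher/1.0"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Return a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_USER_AGENT, timeout=10):
    """Download url and return the raw response body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch('http://example.com/') would then send the browser-like header instead of the library's default Python-urllib one.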

I'm the guy who posted the question. I have some suspicions, but I'm not sure about them, which is why I posted the question here.

What is the cause of this issue?

I think it's due to the host blocking the urllib library using robots.txt or .htaccess, but I'm not sure about it. I'm not even sure that's possible.
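
One way to probe the robots.txt half of that suspicion: urllib2 does not read robots.txt on its own, so the file cannot block a request by itself, but it can show whether the site's operators single out the library's default agent name (Python-urllib). The sketch below uses Python 3's urllib.robotparser on a set of rules supplied as text. The helper is_allowed and the SAMPLE_RULES content are hypothetical, made up for illustration.

```python
import urllib.robotparser


def is_allowed(robots_lines, user_agent, url):
    """Report whether the given robots.txt rules permit user_agent to fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Hypothetical rules that disallow urllib's default agent and allow everyone else.
SAMPLE_RULES = [
    "User-agent: Python-urllib",
    "Disallow: /",
    "",
    "User-agent: *",
    "Disallow:",
]
```

If a real site's robots.txt names Python-urllib like this, a server-side rule (for example in .htaccess) keyed on the same User-Agent string would be a plausible cause of the failure.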

Any workaround for this issue?

If you are on Unix, this will work...

import commands
contents = commands.getoutput("curl -s '"+url+"'")
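
The commands module was removed in Python 3, and building a shell string by hand breaks if the URL contains a single quote. A rough equivalent using subprocess, which passes the URL as a separate argument and skips the shell entirely, could look like this. The function names curl_command and fetch_with_curl are assumptions for illustration, and the sketch still requires curl to be installed.

```python
import subprocess


def curl_command(url):
    """Build the argument list for a silent curl download of url."""
    return ["curl", "-s", url]


def fetch_with_curl(url):
    """Run curl without a shell and return its standard output as bytes."""
    completed = subprocess.run(
        curl_command(url), stdout=subprocess.PIPE, check=True
    )
    return completed.stdout
```

With check=True, a non-zero exit status from curl raises subprocess.CalledProcessError instead of silently returning an empty result.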