urllib2

Python's `urllib2`: Why do I get error 403 when I `urlopen` a Wikipedia page?

ぐ巨炮叔叔 submitted on 2019-11-28 03:03:57
I have a strange bug when trying to urlopen a certain page from Wikipedia. This is the page: http://en.wikipedia.org/wiki/OpenCola_(drink)

This is the shell session:

    >>> f = urllib2.urlopen('http://en.wikipedia.org/wiki/OpenCola_(drink)')
    Traceback (most recent call last):
      File "C:\Program Files\Wing IDE 4.0\src\debug\tserver\_sandbox.py", line 1, in <module>
        # Used internally for debug sandbox under external interpreter
      File "c:\Python26\Lib\urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "c:\Python26\Lib\urllib2.py", line 397, in open
        response = meth(req,
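The traceback is cut off above, but a 403 from Wikipedia is the classic symptom of it rejecting `urllib2`'s default "Python-urllib" User-Agent. A minimal sketch of the usual workaround, sending a custom User-Agent (the header value here is a made-up example):

    import urllib2

    # Wikipedia returns 403 for the stock "Python-urllib/x.y" User-Agent;
    # identifying the script with its own User-Agent string avoids the block.
    req = urllib2.Request(
        'http://en.wikipedia.org/wiki/OpenCola_(drink)',
        headers={'User-Agent': 'MyScript/0.1 (contact: me@example.com)'})
    f = urllib2.urlopen(req)
    print f.read(200)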

Python requests is slow

对着背影说爱祢 submitted on 2019-11-28 01:30:45
Question: I am developing a download manager, using the requests module in Python to check for a valid link (and hopefully broken links). My code for checking a link is below:

    url = 'http://pyscripter.googlecode.com/files/PyScripter-v2.5.3-Setup.exe'
    r = requests.get(url, allow_redirects=False)  # this line takes 40 seconds
    if r.status_code == 200:
        print "link valid"
    else:
        print "link invalid"

Now, the issue is that this check takes approximately 40 seconds, which is huge. My question is how can I speed
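The question breaks off mid-sentence, but the likely reason for the 40 seconds is that `requests.get` downloads the entire executable before returning. A sketch of two common ways to check a link without fetching the body (same URL, purely illustrative):

    import requests

    url = 'http://pyscripter.googlecode.com/files/PyScripter-v2.5.3-Setup.exe'

    # Option 1: HEAD returns only the headers, never the file itself.
    r = requests.head(url, allow_redirects=False)
    print r.status_code

    # Option 2: stream=True defers the body, so get() returns as soon as
    # the status line and headers arrive.
    r = requests.get(url, allow_redirects=False, stream=True)
    print r.status_code
    r.close()  # release the connection; the body was never read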

Python: saving large web page to file

大兔子大兔子 submitted on 2019-11-28 01:29:36
Question: Let me start off by saying I'm not new to programming, but I am very new to Python. I've written a program using urllib2 that requests a web page, which I would then like to save to a file. The web page is about 300KB, which doesn't strike me as particularly large but seems to be enough to give me trouble, so I'm calling it 'large'. I'm using a simple call to copy directly from the object returned by urlopen into the file:

    file.write(webpage.read())

but it will just sit for minutes, trying to
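The question is truncated, but a single `read()` buffers the whole response in memory and can appear to hang on a slow or stalled connection. A minimal sketch of the usual chunked alternative (URL and filename are placeholders):

    import shutil
    import urllib2

    webpage = urllib2.urlopen('http://example.com/page.html')  # placeholder URL
    with open('page.html', 'wb') as out:
        # Copy in fixed-size chunks instead of one giant read().
        shutil.copyfileobj(webpage, out, 16 * 1024)
    webpage.close()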

gaierror: [Errno -2] Name or service not known

拜拜、爱过 submitted on 2019-11-28 00:52:00
Question:

    def make_req(data, url, method='POST'):
        params = urllib.urlencode(data)
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        conn = httplib.HTTPSConnection(url)
        conn.request(method, url, params, headers)
        response = conn.getresponse()
        response_data = response.read()
        conn.close()

But it is throwing:

    in create_connection
        for res in getaddrinfo(host, port, 0, SOCK_STREAM):
    gaierror: [Errno -2] Name or service not known

What is the reason? What is this error?
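The entry ends there, but this gaierror usually means the string handed to `HTTPSConnection` could not be resolved as a hostname; the constructor expects a bare host, not a full URL. A sketch of the likely fix, assuming url looks like 'https://api.example.com/endpoint':

    import httplib
    import urllib
    import urlparse

    def make_req(data, url, method='POST'):
        parsed = urlparse.urlparse(url)
        params = urllib.urlencode(data)
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        # Pass only the host to HTTPSConnection; the path goes to request().
        conn = httplib.HTTPSConnection(parsed.netloc)
        conn.request(method, parsed.path or '/', params, headers)
        response = conn.getresponse()
        response_data = response.read()
        conn.close()
        return response_data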

Does urllib2.urlopen() cache stuff?

不问归期 submitted on 2019-11-27 23:31:27
Question: They didn't mention this in the Python documentation. Recently I've been testing a website by simply refreshing it and using urllib2.urlopen() to extract certain content, and I notice that sometimes when I update the site, urllib2.urlopen() doesn't seem to get the newly added content. So I wonder, does it cache stuff somewhere?

Answer 1: "So I wonder it does cache stuff somewhere, right?" It doesn't. If you don't see new data, this could have many reasons. Most bigger web services use server-side caching for
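The answer is cut off, but one quick way to rule out an intermediate cache between you and the server is to send standard no-cache request headers (a hedged sketch; urllib2 itself keeps no cache, and the URL is a placeholder):

    import urllib2

    req = urllib2.Request('http://example.com/page',
                          headers={'Cache-Control': 'no-cache',
                                   'Pragma': 'no-cache'})
    print urllib2.urlopen(req).read(100)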

Throttling with urllib2

泪湿孤枕 submitted on 2019-11-27 22:21:53
Is it possible to easily cap the kbps when using urllib2? If it is, any code examples or resources you could direct me to would be greatly appreciated.

There is the `urlretrieve(url, filename=None, reporthook=None, data=None)` function in the urllib module. If you implement the reporthook function/object as either a token bucket or a leaky bucket, you have your global rate limit.

EDIT: Upon closer examination, I see that it isn't as easy to do a global rate limit with reporthook as I thought. reporthook is only given the downloaded amount and the total size, which on their own isn't enough to
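The answer trails off, but a minimal illustration of the single-download case is a reporthook closure that sleeps whenever the observed average rate exceeds a cap (the URL and the 50 KB/s figure are placeholders):

    import time
    import urllib

    def make_throttle(max_bps):
        """Return a reporthook keeping the average rate under max_bps."""
        start = [None]  # mutable cell so the closure can set it
        def hook(blocknum, blocksize, totalsize):
            if start[0] is None:
                start[0] = time.time()
                return
            downloaded = blocknum * blocksize
            expected = downloaded / float(max_bps)  # seconds this much should take
            elapsed = time.time() - start[0]
            if elapsed < expected:
                time.sleep(expected - elapsed)      # stall until under the cap
        return hook

    urllib.urlretrieve('http://example.com/big.zip', 'big.zip',
                       reporthook=make_throttle(50 * 1024))  # ~50 KB/s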

Using client certificates with urllib2

徘徊边缘 submitted on 2019-11-27 20:32:16
Question: I need to create a secure channel between my server and a remote web service. I'll be using HTTPS with a client certificate. I'll also need to validate the certificate presented by the remote service. How can I use my own client certificate with urllib2? What will I need to do in my code to ensure that the remote certificate is correct?

Answer 1: Here's a bug in the official Python bug tracker that looks relevant, and it has a proposed patch.

Answer 2: Because alex's answer is a link, and the code on that
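Answer 2 is truncated, but the recipe it points at is the well-known one: wire key_file/cert_file through a custom HTTPSHandler. A sketch under Python 2 (certificate paths are placeholders; note that stock Python 2 httplib does not verify the server's certificate, so this covers only the client-certificate half of the question):

    import httplib
    import urllib2

    class ClientCertHTTPSHandler(urllib2.HTTPSHandler):
        """HTTPSHandler that presents a client certificate."""
        def __init__(self, key_file, cert_file):
            urllib2.HTTPSHandler.__init__(self)
            self.key_file, self.cert_file = key_file, cert_file

        def https_open(self, req):
            return self.do_open(self._make_connection, req)

        def _make_connection(self, host, timeout=300):
            return httplib.HTTPSConnection(host, key_file=self.key_file,
                                           cert_file=self.cert_file)

    opener = urllib2.build_opener(ClientCertHTTPSHandler('my.key', 'my.crt'))
    print opener.open('https://example.com/').read(100)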

How to send a POST request using Django?

对着背影说爱祢 submitted on 2019-11-27 20:04:49
I don't want to use an HTML file; I have to make the POST request from within Django itself, just like urllib2 sends a GET request.

A combination of methods from urllib2 and urllib will do the trick. Here is how I post data using the two:

    post_data = [('name', 'Gladys')]  # a sequence of two-element tuples
    result = urllib2.urlopen('http://example.com', urllib.urlencode(post_data))
    content = result.read()

urlopen() is a method you use for opening URLs. urlencode() converts the arguments to a percent-encoded string.

Here's how you'd write the accepted answer's example using python-requests:

    post_data = {'name'
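The requests snippet is cut off mid-dict; presumably it continues along these lines (a sketch reusing the same data as the urllib2 example above):

    import requests

    post_data = {'name': 'Gladys'}
    response = requests.post('http://example.com', data=post_data)
    content = response.content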

How to make HTTP DELETE method using urllib2?

笑着哭i submitted on 2019-11-27 19:15:31
Does urllib2 support the DELETE or PUT method? If yes, please provide an example. I need to use the Piston API.

Corey Goldberg: You can do it with httplib:

    import httplib
    conn = httplib.HTTPConnection('www.foo.com')
    conn.request('PUT', '/myurl', body)  # body holds the request payload
    resp = conn.getresponse()
    content = resp.read()

Also, check out this question. The accepted answer shows a way to add other methods to urllib2:

    import urllib2
    opener = urllib2.build_opener(urllib2.HTTPHandler)
    request = urllib2.Request('http://example.org', data='your_put_data')
    request.add_header('Content-Type', 'your/contenttype')
    request.get
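The second snippet is truncated at `request.get`; the trick it refers to is overriding the request's get_method, which otherwise only ever reports GET or POST. A sketch of the complete idiom:

    import urllib2

    opener = urllib2.build_opener(urllib2.HTTPHandler)
    request = urllib2.Request('http://example.org', data='your_put_data')
    request.add_header('Content-Type', 'your/contenttype')
    request.get_method = lambda: 'PUT'  # or 'DELETE'
    response = opener.open(request)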

How do I download a zip file in python using urllib2?

拟墨画扇 submitted on 2019-11-27 19:01:49
Two-part question. I am trying to download multiple archived Cory Doctorow podcasts from the Internet Archive, the old ones that do not come into my iTunes feed. I have written the script, but the downloaded files are not properly formatted.

Q1 - What do I change to download the zip mp3 files?

Q2 - What is a better way to pass the variables into the URL?

    # and the base url.
    def dlfile(file_name, file_mode, base_url):
        from urllib2 import Request, urlopen, URLError, HTTPError
        # create the url and the request
        url = base_url + file_name + mid_url + file_name + end_url
        req = Request(url)
        # Open the url
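The script is truncated (mid_url and end_url are defined somewhere in the elided part), but for Q1 the usual fix is writing the response in binary mode, and for Q2 building the URL with urlparse rather than raw string concatenation. A hedged sketch of both ideas:

    import os
    import urlparse
    from urllib2 import Request, urlopen, HTTPError, URLError

    def dlfile(file_name, base_url):
        # Q2: urljoin handles slashes and relative paths for you.
        url = urlparse.urljoin(base_url, file_name)
        try:
            response = urlopen(Request(url))
            # Q1: 'wb' keeps zip/mp3 bytes intact; text mode can corrupt
            # binary data on Windows.
            with open(os.path.basename(file_name), 'wb') as f:
                f.write(response.read())
        except (HTTPError, URLError) as e:
            print "failed to fetch %s: %s" % (url, e)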