urllib2

Python's `urllib2`: Why do I get error 403 when I `urlopen` a Wikipedia page?

ぐ巨炮叔叔 submitted on 2019-11-28 03:03:57
I have a strange bug when trying to urlopen a certain page from Wikipedia. This is the page: http://en.wikipedia.org/wiki/OpenCola_(drink)

This is the shell session:

    >>> f = urllib2.urlopen('http://en.wikipedia.org/wiki/OpenCola_(drink)')
    Traceback (most recent call last):
      File "C:\Program Files\Wing IDE 4.0\src\debug\tserver\_sandbox.py", line 1, in <module>
        # Used internally for debug sandbox under external interpreter
      File "c:\Python26\Lib\urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "c:\Python26\Lib\urllib2.py", line 397, in open
        response = meth(req,
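The traceback is cut off above, but a 403 from Wikipedia is the classic symptom of it rejecting `urllib2`'s default "Python-urllib" User-Agent. A minimal sketch of the usual workaround, sending a custom User-Agent (the header value here is a made-up example):

    import urllib2

    # Wikipedia returns 403 for the stock "Python-urllib/x.y" User-Agent;
    # identifying the script with its own User-Agent string avoids the block.
    req = urllib2.Request(
        'http://en.wikipedia.org/wiki/OpenCola_(drink)',
        headers={'User-Agent': 'MyScript/0.1 (contact: me@example.com)'})
    f = urllib2.urlopen(req)
    print f.read(200)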

Python requests is slow

对着背影说爱祢 submitted on 2019-11-28 01:30:45
Question: I am developing a download manager, using the requests module in Python to check for a valid link (and hopefully broken links). My code for checking a link is below:

    url = 'http://pyscripter.googlecode.com/files/PyScripter-v2.5.3-Setup.exe'
    r = requests.get(url, allow_redirects=False)  # this line takes 40 seconds
    if r.status_code == 200:
        print "link valid"
    else:
        print "link invalid"

Now, the issue is that this check takes approximately 40 seconds, which is huge. My question is how can I speed
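The question breaks off mid-sentence, but the likely reason for the 40 seconds is that `requests.get` downloads the entire executable before returning. A sketch of two common ways to check a link without fetching the body (same URL, purely illustrative):

    import requests

    url = 'http://pyscripter.googlecode.com/files/PyScripter-v2.5.3-Setup.exe'

    # Option 1: HEAD returns only the headers, never the file itself.
    r = requests.head(url, allow_redirects=False)
    print r.status_code

    # Option 2: stream=True defers the body, so get() returns as soon as
    # the status line and headers arrive.
    r = requests.get(url, allow_redirects=False, stream=True)
    print r.status_code
    r.close()  # release the connection; the body was never read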

Python: saving large web page to file

大兔子大兔子 submitted on 2019-11-28 01:29:36
Question: Let me start off by saying I'm not new to programming, but I am very new to Python. I've written a program using urllib2 that requests a web page, which I would then like to save to a file. The web page is about 300KB, which doesn't strike me as particularly large but seems to be enough to give me trouble, so I'm calling it 'large'. I'm using a simple call to copy directly from the object returned by urlopen into the file:

    file.write(webpage.read())

but it will just sit for minutes, trying to
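The question is truncated, but a single `read()` buffers the whole response in memory and can appear to hang on a slow or stalled connection. A minimal sketch of the usual chunked alternative (URL and filename are placeholders):

    import shutil
    import urllib2

    webpage = urllib2.urlopen('http://example.com/page.html')  # placeholder URL
    with open('page.html', 'wb') as out:
        # Copy in fixed-size chunks instead of one giant read().
        shutil.copyfileobj(webpage, out, 16 * 1024)
    webpage.close()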

gaierror: [Errno -2] Name or service not known

拜拜、爱过 submitted on 2019-11-28 00:52:00
Question:

    def make_req(data, url, method='POST'):
        params = urllib.urlencode(data)
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        conn = httplib.HTTPSConnection(url)
        conn.request(method, url, params, headers)
        response = conn.getresponse()
        response_data = response.read()
        conn.close()

But it is throwing:

    in create_connection
        for res in getaddrinfo(host, port, 0, SOCK_STREAM):
    gaierror: [Errno -2] Name or service not known

What is the reason? What is this error?
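The entry ends there, but this gaierror usually means the string handed to `HTTPSConnection` could not be resolved as a hostname; the constructor expects a bare host, not a full URL. A sketch of the likely fix, assuming url looks like 'https://api.example.com/endpoint':

    import httplib
    import urllib
    import urlparse

    def make_req(data, url, method='POST'):
        parsed = urlparse.urlparse(url)
        params = urllib.urlencode(data)
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        # Pass only the host to HTTPSConnection; the path goes to request().
        conn = httplib.HTTPSConnection(parsed.netloc)
        conn.request(method, parsed.path or '/', params, headers)
        response = conn.getresponse()
        response_data = response.read()
        conn.close()
        return response_data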

Does urllib2.urlopen() cache stuff?

不问归期 submitted on 2019-11-27 23:31:27
Question: They didn't mention this in the Python documentation. Recently I've been testing a website by simply refreshing it and using urllib2.urlopen() to extract certain content, and I notice that sometimes when I update the site, urllib2.urlopen() doesn't seem to get the newly added content. So I wonder, does it cache stuff somewhere?

Answer 1: "So I wonder it does cache stuff somewhere, right?" It doesn't. If you don't see new data, this could have many reasons. Most bigger web services use server-side caching for
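The answer is cut off, but one quick way to rule out an intermediate cache between you and the server is to send standard no-cache request headers (a hedged sketch; urllib2 itself keeps no cache, and the URL is a placeholder):

    import urllib2

    req = urllib2.Request('http://example.com/page',
                          headers={'Cache-Control': 'no-cache',
                                   'Pragma': 'no-cache'})
    print urllib2.urlopen(req).read(100)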

Throttling with urllib2

泪湿孤枕 submitted on 2019-11-27 22:21:53
Is it possible to easily cap the kbps when using urllib2? If it is, any code examples or resources you could direct me to would be greatly appreciated.

There is the `urlretrieve(url, filename=None, reporthook=None, data=None)` function in the urllib module. If you implement the reporthook function/object as either a token bucket or a leaky bucket, you have your global rate limit.

EDIT: Upon closer examination, I see that it isn't as easy to do a global rate limit with reporthook as I thought. reporthook is only given the downloaded amount and the total size, which on their own isn't enough to
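The answer trails off, but a minimal illustration of the single-download case is a reporthook closure that sleeps whenever the observed average rate exceeds a cap (the URL and the 50 KB/s figure are placeholders):

    import time
    import urllib

    def make_throttle(max_bps):
        """Return a reporthook keeping the average rate under max_bps."""
        start = [None]  # mutable cell so the closure can set it
        def hook(blocknum, blocksize, totalsize):
            if start[0] is None:
                start[0] = time.time()
                return
            downloaded = blocknum * blocksize
            expected = downloaded / float(max_bps)  # seconds this much should take
            elapsed = time.time() - start[0]
            if elapsed < expected:
                time.sleep(expected - elapsed)      # stall until under the cap
        return hook

    urllib.urlretrieve('http://example.com/big.zip', 'big.zip',
                       reporthook=make_throttle(50 * 1024))  # ~50 KB/s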

Using client certificates with urllib2

徘徊边缘 submitted on 2019-11-27 20:32:16
Question: I need to create a secure channel between my server and a remote web service. I'll be using HTTPS with a client certificate. I'll also need to validate the certificate presented by the remote service. How can I use my own client certificate with urllib2? What will I need to do in my code to ensure that the remote certificate is correct?

Answer 1: Here's a bug in the official Python bug tracker that looks relevant, and it has a proposed patch.

Answer 2: Because alex's answer is a link, and the code on that
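Answer 2 is truncated, but the recipe it points at is the well-known one: wire key_file/cert_file through a custom HTTPSHandler. A sketch under Python 2 (certificate paths are placeholders; note that stock Python 2 httplib does not verify the server's certificate, so this covers only the client-certificate half of the question):

    import httplib
    import urllib2

    class ClientCertHTTPSHandler(urllib2.HTTPSHandler):
        """HTTPSHandler that presents a client certificate."""
        def __init__(self, key_file, cert_file):
            urllib2.HTTPSHandler.__init__(self)
            self.key_file, self.cert_file = key_file, cert_file

        def https_open(self, req):
            return self.do_open(self._make_connection, req)

        def _make_connection(self, host, timeout=300):
            return httplib.HTTPSConnection(host, key_file=self.key_file,
                                           cert_file=self.cert_file)

    opener = urllib2.build_opener(ClientCertHTTPSHandler('my.key', 'my.crt'))
    print opener.open('https://example.com/').read(100)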

How to send a POST request using Django?

对着背影说爱祢 submitted on 2019-11-27 20:04:49
I don't want to use an HTML file; I have to make the POST request from within Django itself, just like urllib2 sends a GET request.

A combination of methods from urllib2 and urllib will do the trick. Here is how I post data using the two:

    post_data = [('name', 'Gladys')]  # a sequence of two-element tuples
    result = urllib2.urlopen('http://example.com', urllib.urlencode(post_data))
    content = result.read()

urlopen() is a method you use for opening URLs. urlencode() converts the arguments to a percent-encoded string.

Here's how you'd write the accepted answer's example using python-requests:

    post_data = {'name'
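The requests snippet is cut off mid-dict; presumably it continues along these lines (a sketch reusing the same data as the urllib2 example above):

    import requests

    post_data = {'name': 'Gladys'}
    response = requests.post('http://example.com', data=post_data)
    content = response.content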

How to make HTTP DELETE method using urllib2?

笑着哭i submitted on 2019-11-27 19:15:31
Does urllib2 support the DELETE or PUT method? If yes, please provide an example. I need to use the Piston API.

Corey Goldberg: You can do it with httplib:

    import httplib
    conn = httplib.HTTPConnection('www.foo.com')
    conn.request('PUT', '/myurl', body)  # body holds the request payload
    resp = conn.getresponse()
    content = resp.read()

Also, check out this question. The accepted answer shows a way to add other methods to urllib2:

    import urllib2
    opener = urllib2.build_opener(urllib2.HTTPHandler)
    request = urllib2.Request('http://example.org', data='your_put_data')
    request.add_header('Content-Type', 'your/contenttype')
    request.get
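The second snippet is truncated at `request.get`; the trick it refers to is overriding the request's get_method, which otherwise only ever reports GET or POST. A sketch of the complete idiom:

    import urllib2

    opener = urllib2.build_opener(urllib2.HTTPHandler)
    request = urllib2.Request('http://example.org', data='your_put_data')
    request.add_header('Content-Type', 'your/contenttype')
    request.get_method = lambda: 'PUT'  # or 'DELETE'
    response = opener.open(request)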

How do I download a zip file in python using urllib2?

拟墨画扇 submitted on 2019-11-27 19:01:49
Two-part question. I am trying to download multiple archived Cory Doctorow podcasts from the Internet Archive, the old ones that do not come into my iTunes feed. I have written the script, but the downloaded files are not properly formatted.

Q1 - What do I change to download the zip mp3 files?

Q2 - What is a better way to pass the variables into the URL?

    # and the base url.
    def dlfile(file_name, file_mode, base_url):
        from urllib2 import Request, urlopen, URLError, HTTPError
        # create the url and the request
        url = base_url + file_name + mid_url + file_name + end_url
        req = Request(url)
        # Open the url
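The script is truncated (mid_url and end_url are defined somewhere in the elided part), but for Q1 the usual fix is writing the response in binary mode, and for Q2 building the URL with urlparse rather than raw string concatenation. A hedged sketch of both ideas:

    import os
    import urlparse
    from urllib2 import Request, urlopen, HTTPError, URLError

    def dlfile(file_name, base_url):
        # Q2: urljoin handles slashes and relative paths for you.
        url = urlparse.urljoin(base_url, file_name)
        try:
            response = urlopen(Request(url))
            # Q1: 'wb' keeps zip/mp3 bytes intact; text mode can corrupt
            # binary data on Windows.
            with open(os.path.basename(file_name), 'wb') as f:
                f.write(response.read())
        except (HTTPError, URLError) as e:
            print "failed to fetch %s: %s" % (url, e)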