How do I read the response headers returned from a PyCurl request?
Answer 1:
By default, PyCurl discards the response headers, but there are several ways to capture them. Here is an example using the HEADERFUNCTION option, which lets you specify a callback function to handle them.
Alternatively, you can use the WRITEHEADER option (not compatible with WRITEFUNCTION), or set HEADER to True so that the headers are delivered together with the body.
    #!/usr/bin/env python3
    import pycurl


    class Storage:
        """Accumulates the chunks passed to a pycurl callback, numbering each one."""

        def __init__(self):
            self.contents = ''
            self.line = 0

        def store(self, buf):
            # Under Python 3, pycurl passes bytes. ISO-8859-1 maps every byte to a
            # character, so decoding never fails; use the page's real charset for
            # the body if that matters.
            self.line += 1
            self.contents = "%s%i: %s" % (self.contents, self.line, buf.decode('iso-8859-1'))

        def __str__(self):
            return self.contents


    retrieved_body = Storage()
    retrieved_headers = Storage()

    c = pycurl.Curl()
    c.setopt(c.URL, 'http://www.demaziere.fr/eve/')
    c.setopt(c.WRITEFUNCTION, retrieved_body.store)
    c.setopt(c.HEADERFUNCTION, retrieved_headers.store)
    c.perform()
    c.close()

    print(retrieved_headers)
    print(retrieved_body)
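The lines handed to a HEADERFUNCTION callback arrive as raw bytes, one header line per call, so they can also be parsed into a dictionary as they come in. The following is a minimal sketch for Python 3 and is not part of the original answer: `HeaderCollector` and `fetch_headers` are hypothetical names, and decoding as ISO-8859-1 is an assumption based on the usual convention for HTTP headers.

```python
class HeaderCollector:
    """Parses header lines into a dict (hypothetical helper, not a pycurl API)."""

    def __init__(self):
        self.status_line = None
        self.headers = {}

    def add_line(self, line):
        # Each call receives one header line as bytes, including the CRLF.
        text = line.decode('iso-8859-1').strip()
        if not text:
            return  # a blank line marks the end of a header block
        if text.startswith('HTTP/'):
            # Status line such as "HTTP/1.1 200 OK": a new response starts here.
            self.status_line = text
            self.headers = {}
            return
        if ':' not in text:
            return  # ignore anything that is not a "Name: value" pair
        name, value = text.split(':', 1)
        # Header names are case-insensitive, so normalise them to lower case.
        self.headers[name.strip().lower()] = value.strip()


def fetch_headers(url):
    """Perform a request with pycurl and return (status_line, headers)."""
    import pycurl  # imported here so the parser above is usable without pycurl
    from io import BytesIO

    collector = HeaderCollector()
    body = BytesIO()
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.WRITEFUNCTION, body.write)  # keep the body off stdout
    c.setopt(c.HEADERFUNCTION, collector.add_line)
    c.perform()
    c.close()
    return collector.status_line, collector.headers
```

Because a duplicate header name overwrites the earlier value in this sketch, headers that may legitimately repeat (Set-Cookie, for example) would need a list per name instead.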
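Answer 2:
    import pycurl
    from io import BytesIO

    url = 'http://example.com/'  # placeholder; the original answer leaves url undefined

    headers = BytesIO()

    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.HEADER, 1)  # also includes the headers in the body stream
    c.setopt(c.NOBODY, 1)  # header only, no body
    c.setopt(c.HEADERFUNCTION, headers.write)
    c.perform()
    c.close()

    # Under Python 3 the callback receives bytes, so decode before printing.
    print(headers.getvalue().decode('iso-8859-1'))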
Add any other curl setopts as necessary/desired, such as FOLLOWLOCATION.
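With FOLLOWLOCATION enabled, the header callback receives the headers of every response in the redirect chain, not only the final one, so the collected text contains several blocks separated by blank lines. Below is a minimal sketch for Python 3 that separates them, assuming the raw bytes were gathered through a HEADERFUNCTION callback; `split_header_blocks` is a hypothetical helper name and the sample bytes are made up for illustration.

```python
def split_header_blocks(raw):
    """Split raw header bytes into one list of lines per HTTP response."""
    blocks = []
    current = []
    for line in raw.decode('iso-8859-1').splitlines():
        # A status line begins a new response; close the previous block first.
        if line.startswith('HTTP/') and current:
            blocks.append(current)
            current = []
        if line.strip():
            current.append(line.strip())
    if current:
        blocks.append(current)
    return blocks


# Illustrative sample: a redirect followed by the final response.
raw = (b"HTTP/1.1 301 Moved Permanently\r\n"
       b"Location: http://example.com/new\r\n"
       b"\r\n"
       b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n")

blocks = split_header_blocks(raw)
final_headers = blocks[-1]  # headers of the last response in the chain
print(final_headers)
```

Taking the last block gives the headers of the page that was finally served, which is usually what is wanted after following redirects.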
Answer 3:
Another alternative is human_curl. Install it with: pip install human_curl
    In [1]: import human_curl as hurl

    In [2]: r = hurl.get("http://stackoverflow.com")

    In [3]: r.headers
    Out[3]:
    {'cache-control': 'public, max-age=45',
     'content-length': '198515',
     'content-type': 'text/html; charset=utf-8',
     'date': 'Thu, 01 Sep 2011 11:53:43 GMT',
     'expires': 'Thu, 01 Sep 2011 11:54:28 GMT',
     'last-modified': 'Thu, 01 Sep 2011 11:53:28 GMT',
     'vary': '*'}
Answer 4:
This might or might not be an alternative for you:
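    from urllib.request import urlopen

    # Python 3 form. In Python 2 this was
    # urllib.urlopen(...).headers.headers (a list of raw header lines).
    headers = urlopen('http://www.pythonchallenge.com').headers.items()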