Python - Example of urllib2 asynchronous / threaded request using HTTPS

走远了吗. 提交于 2019-12-20 09:55:53

问题


I'm having a heck of a time getting asynchronous / threaded HTTPS requests to work using Python's urllib2.

Does anyone out there have a basic example that implements urllib2.Request, urllib2.build_opener and a subclass of urllib2.HTTPSHandler?

Thanks!


回答1:


The code below does 7 http requests asynchronously at the same time. It does not use threads, instead it uses asynchronous networking with the twisted library.

from twisted.web import client
from twisted.internet import reactor, defer

urls = [
 'http://www.python.org', 
 'http://stackoverflow.com', 
 'http://www.twistedmatrix.com', 
 'http://www.google.com',
 'http://launchpad.net',
 'http://github.com',
 'http://bitbucket.org',
]

def finish(results):
    for result in results:
        print 'GOT PAGE', len(result), 'bytes'
    reactor.stop()

waiting = [client.getPage(url) for url in urls]
defer.gatherResults(waiting).addCallback(finish)

reactor.run()



回答2:


there's a really simple way, involving a handler for urllib2, which you can find here: http://pythonquirks.blogspot.co.uk/2009/12/asynchronous-http-request.html

#!/usr/bin/env python

import urllib2
import threading

class MyHandler(urllib2.HTTPHandler):
    def http_response(self, req, response):
        print "url: %s" % (response.geturl(),)
        print "info: %s" % (response.info(),)
        for l in response:
            print l
        return response

o = urllib2.build_opener(MyHandler())
t = threading.Thread(target=o.open, args=('http://www.google.com/',))
t.start()
print "I'm asynchronous!"

t.join()

print "I've ended!"



回答3:


here is an example using urllib2 (with https) and threads. Each thread cycles through a list of URL's and retrieves the resource.

import itertools
import urllib2
from threading import Thread


THREADS = 2
URLS = (
    'https://foo/bar',
    'https://foo/baz',
    )


def main():
    for _ in range(THREADS):
        t = Agent(URLS)
        t.start()


class Agent(Thread):
    def __init__(self, urls):
        Thread.__init__(self)
        self.urls = urls

    def run(self):
        urls = itertools.cycle(self.urls)
        while True:
            data = urllib2.urlopen(urls.next()).read()


if __name__ == '__main__':
    main()



回答4:


You can use asynchronous IO to do this.

requests + gevent = grequests

GRequests allows you to use Requests with Gevent to make asynchronous HTTP Requests easily.

import grequests

urls = [
    'http://www.heroku.com',
    'http://tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://kennethreitz.com'
]

rs = (grequests.get(u) for u in urls)
grequests.map(rs)



回答5:


here is the code from eventlet

urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
     "https://wiki.secondlife.com/w/images/secondlife.jpg",
     "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

import eventlet
from eventlet.green import urllib2

def fetch(url):

  return urllib2.urlopen(url).read()

pool = eventlet.GreenPool()

for body in pool.imap(fetch, urls):
  print "got body", len(body)


来源:https://stackoverflow.com/questions/5808138/python-example-of-urllib2-asynchronous-threaded-request-using-https

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!