Asynchronous HTTP calls in Python

失恋的感觉 2020-12-05 15:29

I have a need for a callback kind of functionality in Python where I am sending a request to a webservice multiple times, with a change in the parameter each time. I want these requests to happen asynchronously rather than sequentially, with something like a callback invoked as each response comes back.

4 answers
  • 2020-12-05 16:10

    Starting in Python 3.2, you can use concurrent.futures for launching parallel tasks.

    Check out this ThreadPoolExecutor example:

    http://docs.python.org/dev/library/concurrent.futures.html#threadpoolexecutor-example

    It spawns threads to retrieve HTML and acts on responses as they are received.

    import concurrent.futures
    import urllib.request
    
    URLS = ['http://www.foxnews.com/',
            'http://www.cnn.com/',
            'http://europe.wsj.com/',
            'http://www.bbc.co.uk/',
            'http://some-made-up-domain.com/']
    
    # Retrieve a single page and report the url and contents
    def load_url(url, timeout):
        # read() returns the full response body as bytes
        with urllib.request.urlopen(url, timeout=timeout) as conn:
            return conn.read()
    
    # We can use a with statement to ensure threads are cleaned up promptly
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Start the load operations and mark each future with its URL
        future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))
    

    The above example uses threading. There is also a similar ProcessPoolExecutor example that uses a pool of processes rather than threads:

    http://docs.python.org/dev/library/concurrent.futures.html#processpoolexecutor-example
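
    The linked page shows the documentation's own example; as a rough adaptation of the fetch code above (not the documentation's code), the same loop can run in a process pool. Note that load_url must be defined at module level and the pool created under an `if __name__ == '__main__'` guard so worker processes can import it:

    import concurrent.futures
    import urllib.request

    URLS = ['http://www.foxnews.com/',
            'http://www.cnn.com/',
            'http://www.bbc.co.uk/']

    def load_url(url, timeout):
        with urllib.request.urlopen(url, timeout=timeout) as conn:
            return conn.read()

    if __name__ == '__main__':
        # Same pattern as above, but each fetch runs in a separate process
        with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
            future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
            for future in concurrent.futures.as_completed(future_to_url):
                url = future_to_url[future]
                try:
                    data = future.result()
                except Exception as exc:
                    print('%r generated an exception: %s' % (url, exc))
                else:
                    print('%r page is %d bytes' % (url, len(data)))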

  • 2020-12-05 16:11

    (Although this thread is about server-side Python, the question was asked a while back, so others might stumble on it while looking for a similar answer on the client side.)

    For a client-side solution, you might want to take a look at the Async.js library, especially the "Control Flow" section.

    https://github.com/caolan/async#control-flow

    By combining "parallel" with "waterfall" you can achieve your desired result:

    waterfall( parallel(TaskA, TaskB, TaskC) -> PostParallelTask )

    If you examine the example under Control Flow, "auto", it gives an example of the above: https://github.com/caolan/async#autotasks-callback, where "write_file" depends on "get_data" and "make_folder", and "email_link" depends on "write_file".

    Please note that all of this happens on the client side (unless you are doing Node.js on the server side).

    For server-side Python, look at PyCURL @ https://github.com/pycurl/pycurl/blob/master/examples/basicfirst.py

    By combining that example with pycurl, you can achieve the non-blocking, multi-request functionality you are after.
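
    As a rough sketch (not the linked example itself), pycurl's CurlMulti interface can drive several transfers from a single thread; the URLs below are placeholders:

    import pycurl
    from io import BytesIO

    # Placeholder URLs purely for illustration
    urls = ["http://www.example.com/", "http://www.python.org/"]

    multi = pycurl.CurlMulti()
    requests = []
    for url in urls:
        buf = BytesIO()
        handle = pycurl.Curl()
        handle.setopt(pycurl.URL, url)
        handle.setopt(pycurl.WRITEFUNCTION, buf.write)
        multi.add_handle(handle)
        requests.append((url, handle, buf))

    # Drive every transfer from one thread; libcurl multiplexes the sockets
    num_active = len(requests)
    while num_active:
        while True:
            ret, num_active = multi.perform()
            if ret != pycurl.E_CALL_MULTI_PERFORM:
                break
        multi.select(1.0)

    for url, handle, buf in requests:
        print('%s is %d bytes' % (url, len(buf.getvalue())))
        multi.remove_handle(handle)
        handle.close()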

    Hope this helps. Good luck.

    Venkatt @ http://MyThinkpond.com

  • 2020-12-05 16:13

    Do you know about eventlet? It lets you write what appears to be synchronous code, but have it operate asynchronously over the network.

    Here's an example of a super minimal crawler:

    # Python 2 example: eventlet's "green" urllib2 yields to other greenlets
    # while a request is waiting on the network
    import eventlet
    from eventlet.green import urllib2

    urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
            "https://wiki.secondlife.com/w/images/secondlife.jpg",
            "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

    def fetch(url):
        return urllib2.urlopen(url).read()

    pool = eventlet.GreenPool()

    for body in pool.imap(fetch, urls):
        print "got body", len(body)
    
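    On Python 3, a rough equivalent might look like the sketch below (it assumes a recent eventlet that ships a green urllib.request module; treat that import path as an assumption):

    import eventlet
    from eventlet.green.urllib import request  # assumed green module path

    urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
            "https://wiki.secondlife.com/w/images/secondlife.jpg"]

    def fetch(url):
        # blocks only the current greenlet, not the whole process
        return request.urlopen(url).read()

    pool = eventlet.GreenPool()
    for body in pool.imap(fetch, urls):
        print("got body", len(body))
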
  • 2020-12-05 16:20

    The Twisted framework is just the ticket for that. If you don't want to take that on, you might also use pycurl, a wrapper for libcurl, which has its own asynchronous event loop and supports callbacks.
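
    As a minimal sketch (the URLs are placeholders and Twisted is assumed to be installed), Twisted's Agent issues the requests concurrently and delivers each response through a Deferred callback:

    from twisted.internet import defer, reactor
    from twisted.web.client import Agent, readBody

    agent = Agent(reactor)

    @defer.inlineCallbacks
    def fetch(url):
        # agent.request returns a Deferred, so the reactor can service the
        # other requests while this one waits on the network
        response = yield agent.request(b"GET", url.encode("ascii"))
        body = yield readBody(response)
        defer.returnValue((url, len(body)))

    def report(results):
        # DeferredList hands back (success, value) pairs in request order
        for ok, value in results:
            if ok:
                print('%s is %d bytes' % value)
            else:
                print('request failed: %s' % value)

    urls = ["http://www.example.com/", "http://www.python.org/"]
    d = defer.DeferredList([fetch(u) for u in urls], consumeErrors=True)
    d.addCallback(report)
    d.addBoth(lambda _: reactor.stop())
    reactor.run()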
