aiohttp: set maximum number of requests per second

耶瑟儿~ 2020-12-05 07:11

How can I set a maximum number of requests per second (i.e. rate-limit them) on the client side using aiohttp?

4 answers
  •  谎友^ (original poster)
    2020-12-05 07:27

    You could either set a delay per request, or group the URLs into batches and throttle the batches to meet the desired frequency.

    1. Delay per request

    Force the script to wait between starting consecutive requests using asyncio.sleep:

    import asyncio
    import aiohttp
    
    delay_per_request = 0.5  # seconds between request starts, i.e. at most 2 requests per second
    urls = [
       # put some URLs here...
    ]
    
    async def app():
        tasks = []
        for url in urls:
            # Schedule the request in the background...
            tasks.append(asyncio.ensure_future(make_request(url)))
            # ...then pause before scheduling the next one
            await asyncio.sleep(delay_per_request)
    
        results = await asyncio.gather(*tasks)
        return results
    
    async def make_request(url):
        print('$$$ making request')
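        # A new session per request keeps the example short; reusing one
        # ClientSession across requests is the usual recommendation.
        async with aiohttp.ClientSession() as sess: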
            async with sess.get(url) as resp:
                status = resp.status
                text = await resp.text()
                print('### got page data')
                return url, status, text
    

    This can be run with e.g. results = asyncio.run(app()).
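
To see how the delay relates to a requests-per-second cap, here is a minimal sketch that paces a stand-in coroutine instead of a real HTTP call. The names delay_for_rate, fake_request and paced are hypothetical helpers introduced for illustration only, and the 0.05 second "latency" is an arbitrary assumption.

```python
import asyncio
import time


def delay_for_rate(max_requests_per_second):
    """Seconds to wait between request starts for a given rate cap."""
    return 1.0 / max_requests_per_second


async def fake_request(url, started):
    # Hypothetical stand-in for make_request: record the start time,
    # then simulate network latency without touching the network.
    started.append(time.monotonic())
    await asyncio.sleep(0.05)
    return url


async def paced(urls, max_requests_per_second):
    delay = delay_for_rate(max_requests_per_second)
    started = []
    tasks = []
    for url in urls:
        # Schedule the request, then wait before scheduling the next one.
        tasks.append(asyncio.ensure_future(fake_request(url, started)))
        await asyncio.sleep(delay)
    results = await asyncio.gather(*tasks)
    return results, started


if __name__ == "__main__":
    results, started = asyncio.run(paced(["a", "b", "c"], max_requests_per_second=20))
    gaps = [later - earlier for earlier, later in zip(started, started[1:])]
    print(results, gaps)
```

With a cap of 20 requests per second the computed delay is 0.05 seconds, so consecutive start times should be roughly that far apart. Note that this only spaces out when requests begin; slow responses can still overlap.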

    2. Batch throttle

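    Using make_request from above, you can request and throttle batches of URLs like this. Requests within a batch run concurrently; after each batch the script waits long enough that the average rate stays at or below max_requests_per_second: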

    import asyncio
    import aiohttp
    import time
    
    max_requests_per_second = 0.5  # i.e. on average at most one request every 2 seconds
    urls = [[
       # put a few URLs here...
    ],[
       # put a few more URLs here...
    ]]
    
    async def app():
        results = []
        for i, batch in enumerate(urls):
            t_0 = time.time()
            print(f'batch {i}')
            tasks = [asyncio.ensure_future(make_request(url)) for url in batch]
            for t in tasks:
                d = await t
                results.append(d)
            t_1 = time.time()
    
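            # Throttle requests: a batch of N requests should take at least
            # N / max_requests_per_second seconds in total
            batch_time = (t_1 - t_0)
            batch_size = len(batch)
            wait_time = (batch_size / max_requests_per_second) - batch_time
            if wait_time > 0:
                print(f'Too fast! Waiting {wait_time} seconds')
                # asyncio.sleep instead of time.sleep, so the event loop is not blocked
                await asyncio.sleep(wait_time)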
    
        return results
    

    Again, this can be run with asyncio.run(app()).
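
The wait-time arithmetic in the batch approach can be checked on its own. The sketch below pulls it into a hypothetical helper, throttle_wait (not part of the answer's code), and clamps negative values to zero, which has the same effect as the `if wait_time > 0` check.

```python
def throttle_wait(batch_size, max_requests_per_second, batch_time):
    """Seconds to pause after a batch so the average rate stays under the cap."""
    # A batch of N requests must span at least N / rate seconds.
    min_batch_duration = batch_size / max_requests_per_second
    return max(0.0, min_batch_duration - batch_time)


# 4 requests at 0.5 requests/second must span at least 8 seconds.
print(throttle_wait(4, 0.5, 3.0))   # batch took 3 s, so wait 5.0 more
print(throttle_wait(4, 0.5, 10.0))  # batch was already slow enough: 0.0
```

So a batch of 4 URLs that finishes in 3 seconds is followed by a 5 second pause, while a batch that took 10 seconds needs no pause at all.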
