What is the fastest way to send 100,000 HTTP requests in Python?

Asked by 暖寄归人 on 2020-11-22 07:12

I am opening a file which has 100,000 URLs. I need to send an HTTP request to each URL and print the status code. I am using Python 2.6, and so far have looked at the many confusing ways Python implements threading/concurrency.

16 Answers
  •  天命终不由人
    2020-11-22 07:18

    I know this is an old question, but in Python 3.7 you can do this using asyncio and aiohttp.

    import asyncio
    import pathlib
    import sys
    
    from aiohttp import ClientSession, ClientConnectorError
    
    async def fetch_html(url: str, session: ClientSession, **kwargs) -> tuple:
        """GET a single URL and return a (url, status code) pair."""
        try:
            resp = await session.request(method="GET", url=url, **kwargs)
        except ClientConnectorError:
            # The connection itself failed, so there is no real HTTP status;
            # 404 is used as a sentinel value here.
            return (url, 404)
        return (url, resp.status)
    
    async def make_requests(urls: set, **kwargs) -> None:
        # A single ClientSession reuses its connection pool across all requests.
        async with ClientSession() as session:
            tasks = [fetch_html(url=url, session=session, **kwargs) for url in urls]
            # Schedule every request concurrently and collect the results.
            results = await asyncio.gather(*tasks)
    
        for url, status in results:
            print(f"{status} - {url}")
    
    if __name__ == "__main__":
        assert sys.version_info >= (3, 7), "Script requires Python 3.7+."
        here = pathlib.Path(__file__).parent
    
        # Read one URL per line, stripping newlines and deduplicating.
        with open(here.joinpath("urls.txt")) as infile:
            urls = set(map(str.strip, infile))
    
        asyncio.run(make_requests(urls=urls))
    

    You can read more about it and see an example here.
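
    With 100,000 URLs, scheduling every request at once can exhaust file descriptors and hammer the target hosts. A common refinement is to cap the number of in-flight requests with asyncio.Semaphore. The sketch below is a minimal variant under that assumption: the bounded_fetch wrapper and the limit of 1000 are illustrative choices, not part of the original answer.

    import asyncio
    from aiohttp import ClientSession

    async def bounded_fetch(sem: asyncio.Semaphore, url: str, session: ClientSession) -> tuple:
        # Hypothetical wrapper: the semaphore admits at most `limit` concurrent
        # requests; the rest wait here until a slot frees up.
        async with sem:
            resp = await session.request(method="GET", url=url)
            return (url, resp.status)

    async def make_bounded_requests(urls: set, limit: int = 1000) -> None:
        sem = asyncio.Semaphore(limit)
        async with ClientSession() as session:
            tasks = [bounded_fetch(sem, url, session) for url in urls]
            for url, status in await asyncio.gather(*tasks):
                print(f"{status} - {url}")

    asyncio.run(make_bounded_requests(urls)) then drops in where asyncio.run(make_requests(urls=urls)) is called above; tuning the limit trades throughput against resource usage.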
