Python requests module and connection reuse

一整个雨季 2020-12-05 10:15

I am working with Python's requests module for HTTP communication, and I am wondering how to reuse already-established TCP connections. The requests module is stateless and

2 answers
  •  死守一世寂寞
    2020-12-05 11:00

    Global functions like requests.get or requests.post create a new requests.Session instance on each call. Connections made with these functions cannot be reused, because you cannot access the automatically created session and use its connection pool for subsequent requests. These functions are fine if you only need to make a few requests. Otherwise you'll want to manage sessions yourself.
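As a minimal sketch of what "managing sessions yourself" could look like: the helper names `make_session` and `fetch_status_codes`, the pool sizes, and the 10-second timeout below are illustrative assumptions, not part of the original answer. `HTTPAdapter` and `Session.mount` are the standard requests hooks for configuring the pool behind a session.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(pool_connections=10, pool_maxsize=10):
    """Hypothetical helper: a Session with an explicitly configured pool.

    pool_connections is the number of per-host pools to cache;
    pool_maxsize is the number of connections kept per pool.
    Both values here are assumptions (10 is also the library default).
    """
    session = requests.Session()
    adapter = HTTPAdapter(pool_connections=pool_connections,
                          pool_maxsize=pool_maxsize)
    # Route both schemes through the same adapter and its connection pools.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


def fetch_status_codes(urls):
    """Fetch every URL through one shared session and return the status codes."""
    # The with-block closes the session, and its pooled connections, on exit.
    with make_session() as session:
        return [session.get(url, timeout=10).status_code for url in urls]
```

Calling `fetch_status_codes` with several URLs on the same host should let the later requests reuse the connection opened by the first one, since they all go through the same session.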

    Here is a quick demonstration of how requests behaves when you use the global get function versus a session.

    Preparation, not really relevant to the question:

    >>> import logging, requests, timeit
    >>> logging.basicConfig(level=logging.DEBUG, format="%(message)s")
    

    See, a new connection is established each time you call get:

    >>> _ = requests.get("https://www.wikipedia.org")
    Starting new HTTPS connection (1): www.wikipedia.org
    >>> _ = requests.get("https://www.wikipedia.org")
    Starting new HTTPS connection (1): www.wikipedia.org
    

    But if you use the same session for subsequent calls, the connection gets reused:

    >>> session = requests.Session()
    >>> _ = session.get("https://www.wikipedia.org")
    Starting new HTTPS connection (1): www.wikipedia.org
    >>> _ = session.get("https://www.wikipedia.org")
    >>> _ = session.get("https://www.wikipedia.org")
    >>> _ = session.get("https://www.wikipedia.org")
    

    Performance:

    >>> timeit.timeit('_ = requests.get("https://www.wikipedia.org")', 'import requests', number=100)
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    ...
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    52.74904417991638
    >>> timeit.timeit('_ = session.get("https://www.wikipedia.org")', 'import requests; session = requests.Session()', number=100)
    Starting new HTTPS connection (1): www.wikipedia.org
    15.770191192626953
    

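    Reusing the session (and thus the session's connection pool) is much faster: in this run, 100 requests took about 15.8 seconds with a shared session versus about 52.7 seconds with requests.get.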
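The connection counts seen in the debug log can also be checked without an external site. The following is a rough sketch, assuming a throwaway HTTP/1.1 server on 127.0.0.1; `CountingHandler` and `count_connections` are hypothetical names introduced here, and the server counts each accepted TCP connection.

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests

# Keep local traffic off any proxy configured in the environment.
os.environ["NO_PROXY"] = os.environ["no_proxy"] = "127.0.0.1"


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # allow keep-alive so connections can be reused
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def count_connections(fetch, n=5):
    """Call fetch(url) n times against a local server; return connections opened."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        for _ in range(n):
            fetch(url)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


per_call = count_connections(requests.get)      # new session on every call
with requests.Session() as session:
    pooled = count_connections(session.get)     # one session, one pool

print("requests.get:", per_call, "connections")
print("session.get: ", pooled, "connections")
```

Under these assumptions the script is expected to report five connections for the five requests.get calls and a single connection for the five session.get calls, which matches the log output shown in the answer.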
