HTTP request with timeout, maximum size and connection pooling


I'm looking for a way in Python (2.7) to do HTTP requests with 3 requirements:

  • timeout (for reliability)
  • content maximum size (for security)
  • connection pooling (for performance)
1 Answer

    You can do it with requests just fine, but you need to know that the raw object is part of the urllib3 internals, and to make use of the extra arguments that the HTTPResponse.read() call supports, which let you specify that you want to read decoded data:

    import requests
    import json
    
    # stream=True defers downloading the body so we can bound how much we read
    r = requests.get('https://github.com/timeline.json', timeout=5, stream=True)
    
    # ask for at most 100k + 1 decoded bytes; the extra byte lets us detect overflow
    content = r.raw.read(100000+1, decode_content=True)
    if len(content) > 100000:
        raise ValueError('Too large a response')
    print content
    print json.loads(content)
    

    Alternatively, you can set the decode_content flag on the raw object before reading:

    import requests
    import json
    
    r = requests.get('https://github.com/timeline.json', timeout=5, stream=True)
    
    # setting the flag on the raw urllib3 response makes plain .read() return decoded data
    r.raw.decode_content = True
    content = r.raw.read(100000+1)
    if len(content) > 100000:
        raise ValueError('Too large a response')
    print content
    print json.loads(content)
    

    If you don't like reaching into urllib3 guts like that, use response.iter_content() to iterate over the decoded content in chunks; this uses the underlying HTTPResponse too (via the .stream() generator version):

    import requests
    import json
    
    r = requests.get('https://github.com/timeline.json', timeout=5, stream=True)
    
    maxsize = 100000
    content = ''
    # iter_content() yields decoded chunks; abort as soon as the running total exceeds maxsize
    for chunk in r.iter_content(2048):
        content += chunk
        if len(content) > maxsize:
            r.close()
            raise ValueError('Response too large')
    
    print content
    print json.loads(content)
    

    There is a subtle difference here in how compressed data sizes are handled: r.raw.read(100000+1) will only ever read 100k bytes of compressed data, and the uncompressed data is tested against your max size. The iter_content() method will read more uncompressed data in the rare case the compressed stream is larger than the uncompressed data.
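    If you want to bound the on-the-wire size as well as the decoded size, one possibility (just a sketch, assuming a urllib3 version that provides HTTPResponse.tell(), which reports how many bytes have been pulled over the wire so far) is to check both counts inside the same loop:

    import requests
    import json
    
    r = requests.get('https://github.com/timeline.json', timeout=5, stream=True)
    
    maxsize = 100000
    content = ''
    for chunk in r.iter_content(2048):
        content += chunk
        # len(content) counts decoded bytes, r.raw.tell() counts bytes read off the wire;
        # the two differ when the response is gzip/deflate compressed
        if len(content) > maxsize or r.raw.tell() > maxsize:
            r.close()
            raise ValueError('Response too large')
    
    print content
    print json.loads(content)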

    Neither method allows r.json() to work, because the response._content attribute isn't set by these; you can set it manually, of course. But since the .raw.read() and .iter_content() calls already give you access to the content in question, there is really no need.
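    If you do want the convenience accessors anyway, a minimal sketch (relying on _content being a private requests.Response attribute, so an internal detail that could change) is to assign the capped body back onto the response:

    import requests
    
    r = requests.get('https://github.com/timeline.json', timeout=5, stream=True)
    
    content = r.raw.read(100000+1, decode_content=True)
    if len(content) > 100000:
        raise ValueError('Too large a response')
    
    # _content is private to requests.Response; assigning it by hand lets the
    # usual r.text / r.json() accessors operate on the size-capped body
    r._content = content
    print r.json()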
