Progress of Python requests post

Backend · Unresolved · 5 answers · 902 views
我在风中等你 2020-12-13 05:29

I am uploading a large file using the Python requests package, and I can't find any way to give data back about the progress of the upload. I have seen a number of progress meters for downloading a file, but these do not work for uploads.

5 Answers
  • 2020-12-13 06:03

    I recommend the requests-toolbelt package, which makes monitoring uploaded bytes very easy:

    from requests_toolbelt import MultipartEncoder, MultipartEncoderMonitor
    import requests
    
    def my_callback(monitor):
        # called as the request body is read; monitor.bytes_read is the running total
        print(monitor.bytes_read)
    
    e = MultipartEncoder(
        fields={'field0': 'value', 'field1': 'value',
                'field2': ('filename', open('file.py', 'rb'), 'text/plain')}
        )
    m = MultipartEncoderMonitor(e, my_callback)
    
    r = requests.post('http://httpbin.org/post', data=m,
                      headers={'Content-Type': m.content_type})
    

    And you may want to hook a progress bar library such as tqdm into the callback to show a progress bar.
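
    For instance, a minimal sketch (assuming tqdm is installed; the field, file, and URL are placeholders) that feeds the monitor's running byte count into a tqdm bar:

    from requests_toolbelt import MultipartEncoder, MultipartEncoderMonitor
    from tqdm import tqdm
    import requests

    e = MultipartEncoder(
        fields={'field2': ('filename', open('file.py', 'rb'), 'text/plain')})
    bar = tqdm(total=e.len, unit='B', unit_scale=True)

    def my_callback(monitor):
        # advance the bar by the bytes read since the previous callback
        bar.update(monitor.bytes_read - bar.n)

    m = MultipartEncoderMonitor(e, my_callback)
    r = requests.post('http://httpbin.org/post', data=m,
                      headers={'Content-Type': m.content_type})
    bar.close()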

  • 2020-12-13 06:10

    requests doesn't support upload streaming, e.g. the following fails:

    import os
    import sys
    import requests  # pip install requests
    
    class upload_in_chunks(object):
        def __init__(self, filename, chunksize=1 << 13):
            self.filename = filename
            self.chunksize = chunksize
            self.totalsize = os.path.getsize(filename)
            self.readsofar = 0
    
        def __iter__(self):
            with open(self.filename, 'rb') as file:
                while True:
                    data = file.read(self.chunksize)
                    if not data:
                        sys.stderr.write("\n")
                        break
                    self.readsofar += len(data)
                    percent = self.readsofar * 1e2 / self.totalsize
                    sys.stderr.write("\r{percent:3.0f}%".format(percent=percent))
                    yield data
    
        def __len__(self):
            return self.totalsize
    
    # XXX fails
    r = requests.post("http://httpbin.org/post",
                      data=upload_in_chunks(__file__, chunksize=10))
    

    By the way, if you don't need to report progress, you could use a memory-mapped file to upload the large file.
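
    A minimal sketch of that memory-mapped alternative (the filename and URL are placeholders):

    import mmap
    import requests

    with open('large.bin', 'rb') as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            requests.post('http://httpbin.org/post', data=mm)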

    To work around it, you could create a file adapter similar to the one from urllib2 POST progress monitoring:

    class IterableToFileAdapter(object):
        def __init__(self, iterable):
            self.iterator = iter(iterable)
            self.length = len(iterable)
    
        def read(self, size=-1): # TBD: add buffer for `len(data) > size` case
            return next(self.iterator, b'')
    
        def __len__(self):
            return self.length
    

    Example

    it = upload_in_chunks(__file__, 10)
    r = requests.post("http://httpbin.org/post", data=IterableToFileAdapter(it))
    
    # pretty print
    import json
    json.dump(r.json(), sys.stdout, indent=4, ensure_ascii=False)
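
    Note that current releases of requests do accept a plain generator as data (the body is then sent with chunked transfer encoding), so on a modern installation a progress-reporting upload can be sketched like this, assuming the server accepts chunked requests:

    import os
    import sys
    import requests

    def read_in_chunks(filename, chunksize=1 << 13):
        # yield the file chunk by chunk, printing progress to stderr
        total = os.path.getsize(filename)
        sent = 0
        with open(filename, 'rb') as f:
            while True:
                chunk = f.read(chunksize)
                if not chunk:
                    break
                sent += len(chunk)
                sys.stderr.write("\r{0:3.0f}%".format(sent * 100.0 / total))
                yield chunk
        sys.stderr.write("\n")

    r = requests.post("http://httpbin.org/post", data=read_in_chunks(__file__))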
    
  • 2020-12-13 06:17

    I got it working with the code from here: Simple file upload progressbar in PyQt. I changed it a bit to use BytesIO instead of StringIO.

    from io import BytesIO

    import requests

    class CancelledError(Exception):
        def __init__(self, msg):
            self.msg = msg
            Exception.__init__(self, msg)
    
        def __str__(self):
            return self.msg
    
        __repr__ = __str__
    
    class BufferReader(BytesIO):
        def __init__(self, buf=b'',
                     callback=None,
                     cb_args=(),
                     cb_kwargs=None):  # avoid a mutable default argument
            self._callback = callback
            self._cb_args = cb_args
            self._cb_kwargs = cb_kwargs if cb_kwargs is not None else {}
            self._progress = 0
            self._len = len(buf)
            BytesIO.__init__(self, buf)
    
        def __len__(self):
            return self._len
    
        def read(self, n=-1):
            chunk = BytesIO.read(self, n)
            self._progress += int(len(chunk))
            self._cb_kwargs.update({
                'size'    : self._len,
                'progress': self._progress
            })
            if self._callback:
                try:
                    self._callback(*self._cb_args, **self._cb_kwargs)
                except Exception:  # an exception from the callback cancels the upload
                    raise CancelledError('The upload was cancelled.')
            return chunk
    
    
    def progress(size=None, progress=None):
        # bytes sent so far / total size
        print("{0} / {1}".format(progress, size))
    
    
    url = "http://httpbin.org/post"  # placeholder for your upload endpoint

    files = {"upfile": ("file.bin", open("file.bin", 'rb').read())}
    
    (data, ctype) = requests.packages.urllib3.filepost.encode_multipart_formdata(files)
    
    headers = {
        "Content-Type": ctype
    }
    
    body = BufferReader(data, progress)
    requests.post(url, data=body, headers=headers)
    

    The trick is to generate the body and the Content-Type header from the files dict manually, using encode_multipart_formdata() from urllib3.
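
    The same helper can also be imported from urllib3 directly (urllib3 is installed as a dependency of requests); a minimal sketch:

    from urllib3.filepost import encode_multipart_formdata

    files = {"upfile": ("file.bin", open("file.bin", "rb").read())}
    data, ctype = encode_multipart_formdata(files)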

  • 2020-12-13 06:19

    Usually you would build a streaming data source (a generator) that reads the file in chunks and reports its progress along the way (see kennethreitz/requests#663). This does not work with the requests file API, because requests doesn't support streaming uploads (see kennethreitz/requests#295): a file to upload needs to be complete in memory before it starts getting processed.

    But requests can stream content from a generator, as J.F. Sebastian has shown in an earlier answer; the generator just needs to produce the complete data stream, including the multipart encoding and boundaries. This is where poster comes into play.

    poster was originally written for Python's urllib2 and supports streaming generation of multipart requests, providing progress indication as it goes. poster's homepage provides examples of using it together with urllib2, but you really don't want to use urllib2; just look at any example of doing HTTP Basic Authentication with urllib2. Horrible.

    So we really want to use poster together with requests to do file uploads with tracked progress. And here is how:

    # load requests-module, a streamlined http-client lib
    import requests
    
    # load poster's encode function
    # (note: the poster package itself only targets Python 2)
    from poster.encode import multipart_encode
    
    
    
    # an adapter which makes the multipart generator issued by poster accessible to requests
    # based upon code from http://stackoverflow.com/a/13911048/1659732
    class IterableToFileAdapter(object):
        def __init__(self, iterable):
            self.iterator = iter(iterable)
            self.length = iterable.total
    
        def read(self, size=-1):
            return next(self.iterator, b'')
    
        def __len__(self):
            return self.length
    
    # define a helper function simulating the interface of poster's multipart_encode() function,
    # but wrapping its generator with the file-like adapter
    def multipart_encode_for_requests(params, boundary=None, cb=None):
        datagen, headers = multipart_encode(params, boundary, cb)
        return IterableToFileAdapter(datagen), headers
    
    
    
    # this is your progress callback
    def progress(param, current, total):
        if not param:
            return

        # check out http://tcd.netinf.eu/doc/classnilib_1_1encode_1_1MultipartParam.html
        # for a complete list of the properties param provides to you
        print("{0} ({1}) - {2:d}/{3:d} - {4:.2f}%".format(
            param.name, param.filename, current, total,
            float(current) / float(total) * 100))
    
    # generate headers and a data generator in a requests-compatible format
    # and provide our progress callback
    datagen, headers = multipart_encode_for_requests({
        "input_file": open('recordings/really-large.mp4', "rb"),
        "another_input_file": open('recordings/even-larger.mp4', "rb"),
    
        "field": "value",
        "another_field": "another_value",
    }, cb=progress)
    
    # use the requests lib to issue a POST request with our data attached
    r = requests.post(
        'https://httpbin.org/post',
        auth=('user', 'password'),
        data=datagen,
        headers=headers
    )
    
    # show response code and body
    print(r, r.text)
    
  • 2020-12-13 06:19

    My upload server doesn't support chunked transfer encoding, so I came up with this solution. It's basically just a wrapper around Python's IOBase that allows tqdm.wrapattr to work seamlessly.

    import io
    import os
    from collections.abc import Iterable
    from typing import Union

    import requests
    from tqdm import tqdm
    from tqdm.utils import CallbackIOWrapper
    
    class UploadChunksIterator(Iterable):
        """
        This is an interface between python requests and tqdm.
        It lets requests consume the tqdm wrapper just like an IOBase stream.
        """
    
        def __init__(
            self, file: Union[io.BufferedReader, CallbackIOWrapper], total_size: int, chunk_size: int = 16 * 1024
        ):  # 16 KiB chunks
            self.file = file
            self.chunk_size = chunk_size
            self.total_size = total_size
    
        def __iter__(self):
            return self
    
        def __next__(self):
            data = self.file.read(self.chunk_size)
            if not data:
                raise StopIteration
            return data
    
        # we don't take the length from io.BufferedReader, because CallbackIOWrapper only exposes a read()
        # method; requests uses __len__ to set Content-Length instead of falling back to chunked encoding
        def __len__(self):
            return self.total_size
    
    fp = "data/mydata.mp4"
    s3url = "https://example.com/upload"  # placeholder upload URL
    _quiet = False
    
    with open(fp, "rb") as f:
        total_size = os.fstat(f.fileno()).st_size
        if not _quiet:
            f = tqdm.wrapattr(f, "read", desc=fp, miniters=1, total=total_size, ascii=True)
    
        with f as f_iter:
            res = requests.put(
                url=s3url,
                data=UploadChunksIterator(f_iter, total_size=total_size),
            )
        res.raise_for_status()
    