Fetch a file from a local url with Python requests?

误落风尘 2020-12-02 14:29

I am using Python's requests library in one method of my application. The body of the method looks like this:

def handle_remote_file(url, **kwargs):
    res         


        
6 Answers
  • 2020-12-02 15:00

    Here's a transport adapter I wrote which is more featureful than b1r3k's and has no additional dependencies beyond Requests itself. I haven't tested it exhaustively yet, but what I have tried seems to be bug-free.

    import requests
    import os, sys
    
    if sys.version_info.major < 3:
        from urllib import url2pathname
    else:
        from urllib.request import url2pathname
    
    class LocalFileAdapter(requests.adapters.BaseAdapter):
        """Protocol Adapter to allow Requests to GET file:// URLs
    
        @todo: Properly handle non-empty hostname portions.
        """
    
        @staticmethod
        def _chkpath(method, path):
            """Return an HTTP status for the given filesystem path."""
            if method.lower() in ('put', 'delete'):
                return 501, "Not Implemented"  # TODO
            elif method.lower() not in ('get', 'head'):
                return 405, "Method Not Allowed"
            elif os.path.isdir(path):
                return 400, "Path Not A File"
            elif not os.path.isfile(path):
                return 404, "File Not Found"
            elif not os.access(path, os.R_OK):
                return 403, "Access Denied"
            else:
                return 200, "OK"
    
        def send(self, req, **kwargs):  # pylint: disable=unused-argument
            """Return the file specified by the given request
    
            @type req: C{PreparedRequest}
            @todo: Should I bother filling `response.headers` and processing
                   If-Modified-Since and friends using `os.stat`?
            """
            path = os.path.normcase(os.path.normpath(url2pathname(req.path_url)))
            response = requests.Response()
    
            response.status_code, response.reason = self._chkpath(req.method, path)
            if response.status_code == 200 and req.method.lower() != 'head':
                try:
                    response.raw = open(path, 'rb')
                except (OSError, IOError) as err:
                    response.status_code = 500
                    response.reason = str(err)
    
            if isinstance(req.url, bytes):
                response.url = req.url.decode('utf-8')
            else:
                response.url = req.url
    
            response.request = req
            response.connection = self
    
            return response
    
        def close(self):
            pass
    

    (Despite the name, it was completely written before I thought to check Google, so it has nothing to do with b1r3k's.) As with the other answer, follow this with:

    requests_session = requests.session()
    requests_session.mount('file://', LocalFileAdapter())
    r = requests_session.get('file:///path/to/your/file')
    
  • 2020-12-02 15:04

    The easiest way seems to be requests-file: https://github.com/dashea/requests-file (also available through PyPI).

    "Requests-File is a transport adapter for use with the Requests Python library to allow local filesystem access via file:// URLs."

    This in combination with requests-html is pure magic :)

  • 2020-12-02 15:09

    I think a simple solution is to start a temporary HTTP server with Python and fetch the files through it.

    1. Put all your files in a temporary folder, e.g. tempFolder.
    2. Go to that directory in a terminal (or cmd, depending on your OS) and start a temporary HTTP server with the command python -m http.server 8000 (8000 is the port number).
    3. This gives you an HTTP server that you can reach at http://127.0.0.1:8000/
    4. Open the desired file in a browser and use its link as your URL.
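
    The steps above can also be done from inside a script or test instead of a terminal. What follows is only a sketch under a few assumptions: Python 3.7 or newer (for ThreadingHTTPServer and the handler's directory argument), and a helper name, serve_directory, that is made up for illustration.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on 127.0.0.1 from a background thread.

    port=0 asks the OS for any free port. Returns (server, base_url).
    (Hypothetical helper, not part of requests or the standard library.)
    """
    # Bind the handler to the folder so the working directory is untouched.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread: it will not keep the interpreter alive on exit.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    host, bound_port = server.server_address[:2]
    return server, "http://{}:{}/".format(host, bound_port)
```

    The returned base URL can then be passed to requests.get with a file name appended. Calling server.shutdown() followed by server.server_close() stops the server once the files have been fetched.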
  • 2020-12-02 15:10

    packages/urllib3/poolmanager.py pretty much explains it: only the http and https schemes have connection pools, so Requests doesn't support local URLs.

    pool_classes_by_scheme = {                                                        
        'http': HTTPConnectionPool,                                                   
        'https': HTTPSConnectionPool,                                              
    }                                                                                 
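
    The same limitation can be observed from the Requests side without touching the network. A default Session only mounts adapters for http:// and https://, so asking it for an adapter for any other scheme raises InvalidSchema. A minimal sketch, where has_adapter is a name made up for illustration:

```python
import requests


def has_adapter(url):
    """Return True if a default Session has a transport adapter for `url`.

    (Hypothetical helper; it only wraps Session.get_adapter.)
    """
    session = requests.Session()
    try:
        # Looks up the mounted adapter by URL prefix; no request is sent.
        session.get_adapter(url)
    except requests.exceptions.InvalidSchema:
        return False
    return True
```

    With this, has_adapter('file:///path/to/your/file') is False on a plain Session, which is the same error path requests.get hits for a file:// URL.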
    
  • 2020-12-02 15:15

    As @WooParadog explained, the requests library doesn't know how to handle local files. However, the current version allows you to define transport adapters.

    Therefore you can simply define your own adapter that is able to handle local files, e.g.:

    import os
    
    import requests
    from requests_testadapter import Resp
    
    class LocalFileAdapter(requests.adapters.HTTPAdapter):
        def build_response_from_file(self, request):
            file_path = request.url[7:]
            with open(file_path, 'rb') as file:
                buff = bytearray(os.path.getsize(file_path))
                file.readinto(buff)
                resp = Resp(buff)
                r = self.build_response(request, resp)
    
                return r
    
        def send(self, request, stream=False, timeout=None,
                 verify=True, cert=None, proxies=None):
    
            return self.build_response_from_file(request)
    
    requests_session = requests.session()
    requests_session.mount('file://', LocalFileAdapter())
    requests_session.get('file://<some_local_path>')
    

    I'm using the requests-testadapter module in the above example.

  • 2020-12-02 15:26

    In a recent project, I've had the same issue. Since requests doesn't support the "file" scheme, I'll patch our code to load the content locally. First, I define a function to replace requests.get:

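    import mock
    import six
    
    def local_get(url, **kwargs):
        "Fetch a stream from local files."
        # No `self` parameter: this stands in for the module-level
        # requests.get(url, ...), which is called with the URL first.
        p_url = six.moves.urllib.parse.urlparse(url)
        if p_url.scheme != 'file':
            raise ValueError("Expected file scheme")
    
        filename = six.moves.urllib.request.url2pathname(p_url.path)
        return open(filename, 'rb')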
    

    Then, somewhere in test setup or decorating the test function, I use mock.patch to patch the get function on requests:

    @mock.patch('requests.get', local_get)
    def test_handle_remote_file(self):
        ...
    

    This technique is somewhat brittle -- it doesn't help if the underlying code calls requests.request or constructs a Session and calls that. There may be a way to patch requests at a lower level to support file: URLs, but in my initial investigation, there didn't seem to be an obvious hook point, so I went with this simpler approach.
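
    If you control the calling code, one way to avoid patching altogether is to branch on the URL scheme before Requests is involved. This is only a sketch of that idea, not something from the answer above: read_url and its http_get parameter are hypothetical names, and it uses the Python 3 standard library rather than six.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def read_url(url, http_get=None):
    """Return the bytes behind `url`, reading file:// URLs from disk.

    `http_get` is an injectable callable (assumed to behave like
    requests.get) so tests can substitute a fake without mock.patch.
    """
    parsed = urlparse(url)
    if parsed.scheme == "file":
        # url2pathname undoes percent-encoding and handles Windows drives.
        with open(url2pathname(parsed.path), "rb") as fh:
            return fh.read()
    if http_get is None:
        import requests  # imported lazily; only needed for remote URLs
        http_get = requests.get
    response = http_get(url)
    response.raise_for_status()
    return response.content
```

    Because the scheme check happens in your own function, it keeps working no matter whether the remote branch later switches to requests.request or to a Session.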
