Is there a convenient way to map a file uri to os.path?

栀梦 2020-12-14 07:52

A subsystem which I have no control over insists on providing filesystem paths in the form of a uri. Is there a python module/function which can convert this path into the a

5 answers
  •  时光取名叫无心
    2020-12-14 08:27

    Of all the answers so far, I found none that catches the edge cases, needs no branching, is compatible with both Python 2 and 3, and works cross-platform.

    In short, this does the job, using only the standard library:

    import os

    try:
        from urllib.parse import urlparse, unquote
        from urllib.request import url2pathname
    except ImportError:
        # backwards compatibility (Python 2)
        from urlparse import urlparse
        from urllib import unquote, url2pathname
    
    
    def uri_to_path(uri):
        parsed = urlparse(uri)
        # Always prefix the path with the host, e.g. \\host\ on Windows.
        host = "{0}{0}{mnt}{0}".format(os.path.sep, mnt=parsed.netloc)
        # abspath (not normpath) so that the leading \\ of a UNC path survives.
        return os.path.abspath(
            os.path.join(host, url2pathname(unquote(parsed.path)))
        )
    

    The tricky part, I found, is handling Windows paths that specify a host. This is a non-issue outside of Windows: on *NIX, network locations can only be reached via paths after being mounted to the root of the filesystem.

    From Wikipedia: A file URI takes the form of file://host/path , where host is the fully qualified domain name of the system on which the path is accessible [...]. If host is omitted, it is taken to be "localhost".
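
    To make the host/path split concrete, here is a small sketch of what urlparse reports for a file URI with and without a host. It assumes Python 3 (urllib.parse); the two URIs are taken from the demo output further down, and the variable names are only illustrative.

```python
from urllib.parse import urlparse

# A URI with an explicit host: the host ends up in netloc.
with_host = urlparse("file://networkstorage/homes/rdekleer")

# A URI with the host omitted (three slashes): netloc is empty.
local_only = urlparse("file:///usr/share/python3/py3versions.py")

print(with_host.netloc, with_host.path)          # networkstorage /homes/rdekleer
print(repr(local_only.netloc), local_only.path)  # '' /usr/share/python3/py3versions.py
```

    An empty netloc is why the host prefix is harmless for local files: there is nothing to prepend except path separators.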

    With that in mind, I make it a rule to ALWAYS prefix the path with the netloc provided by urlparse before passing it to os.path.abspath. The abspath call is necessary because it removes any resulting redundant slashes. os.path.normpath also claims to fix the slashes, but it can get a little over-zealous on Windows and drop one of the two leading backslashes of a network path (see the "rebuilt normpath" lines in the Windows results below), hence the use of abspath.

    The other crucial component in the conversion is using unquote to decode the URL percent-encoding, which your filesystem won't otherwise understand. Again, this might be a bigger issue on Windows, where paths commonly contain characters such as $ and spaces, all of which will have been encoded in the file URI.
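
    As a quick illustration of the decoding step on its own, the sketch below runs unquote over the path components of two URIs from the demo. It assumes Python 3 (on Python 2, unquote lives in urllib instead); the variable names are made up for the example.

```python
from urllib.parse import unquote

# %24 is the percent-encoding of "$", %20 of a space.
decoded_share = unquote("/c%24/WINDOWS/clock.avi")
decoded_spaces = unquote("/C:/yikes/paths%20with%20spaces.txt")

print(decoded_share)   # /c$/WINDOWS/clock.avi
print(decoded_spaces)  # /C:/yikes/paths with spaces.txt
```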

    For a demo:

    import os
    from pathlib import Path   # On Python < 3.4, pathlib must be installed with pip
    import sys
    try:
        from urllib.parse import urlparse, unquote
        from urllib.request import url2pathname
    except ImportError:  # backwards compatibility (Python 2)
        from urlparse import urlparse
        from urllib import unquote, url2pathname
    
    DIVIDER = "-" * 30
    
    if sys.platform == "win32":  # WINDOWS
        filepaths = [
            r"C:\Python27\Scripts\pip.exe",
            r"C:\yikes\paths with spaces.txt",
            r"\\localhost\c$\WINDOWS\clock.avi",
            r"\\networkstorage\homes\rdekleer",
        ]
    else:  # *NIX
        filepaths = [
            os.path.expanduser("~/.profile"),
            "/usr/share/python3/py3versions.py",
        ]
    
    for path in filepaths:
        uri = Path(path).as_uri()
        parsed = urlparse(uri)
        host = "{0}{0}{mnt}{0}".format(os.path.sep, mnt=parsed.netloc)
        normpath = os.path.normpath(
            os.path.join(host, url2pathname(unquote(parsed.path)))
        )
        absolutized = os.path.abspath(
            os.path.join(host, url2pathname(unquote(parsed.path)))
        )
        result = ("{DIVIDER}"
                  "\norig path:       \t{path}"
                  "\nconverted to URI:\t{uri}"
                  "\nrebuilt normpath:\t{normpath}"
                  "\nrebuilt abspath:\t{absolutized}").format(**locals())
        print(result)
        assert path == absolutized
    

    Results (WINDOWS):

    ------------------------------
    orig path:              C:\Python27\Scripts\pip.exe
    converted to URI:       file:///C:/Python27/Scripts/pip.exe
    rebuilt normpath:       C:\Python27\Scripts\pip.exe
    rebuilt abspath:        C:\Python27\Scripts\pip.exe
    ------------------------------
    orig path:              C:\yikes\paths with spaces.txt
    converted to URI:       file:///C:/yikes/paths%20with%20spaces.txt
    rebuilt normpath:       C:\yikes\paths with spaces.txt
    rebuilt abspath:        C:\yikes\paths with spaces.txt
    ------------------------------
    orig path:              \\localhost\c$\WINDOWS\clock.avi
    converted to URI:       file://localhost/c%24/WINDOWS/clock.avi
    rebuilt normpath:       \localhost\c$\WINDOWS\clock.avi
    rebuilt abspath:        \\localhost\c$\WINDOWS\clock.avi
    ------------------------------
    orig path:              \\networkstorage\homes\rdekleer
    converted to URI:       file://networkstorage/homes/rdekleer
    rebuilt normpath:       \networkstorage\homes\rdekleer
    rebuilt abspath:        \\networkstorage\homes\rdekleer
    

    Results (*NIX):

    ------------------------------
    orig path:              /home/rdekleer/.profile
    converted to URI:       file:///home/rdekleer/.profile
    rebuilt normpath:       /home/rdekleer/.profile
    rebuilt abspath:        /home/rdekleer/.profile
    ------------------------------
    orig path:              /usr/share/python3/py3versions.py
    converted to URI:       file:///usr/share/python3/py3versions.py
    rebuilt normpath:       /usr/share/python3/py3versions.py
    rebuilt abspath:        /usr/share/python3/py3versions.py
    
