Unicode filenames on Windows with Python & subprocess.Popen()

独厮守ぢ 2020-11-27 08:04

Why does the following occur:

>>> u'\u0308'.encode('mbcs')   #UMLAUT
'\xa8'
>>> u'\u041A'.encode('mbcs')   #CYRILLIC CAPITAL L


        
5 Answers
  •  时光说笑
    2020-11-27 08:44

    DISCLAIMER: I'm the author of the fix mentioned below.

    To support Unicode command lines on Windows with Python 2.7, you can use this patch to subprocess.Popen(..).

    The situation

    Python 2's support for Unicode command lines on Windows is very poor.

    Two things are severely bugged:

    • passing a Unicode command line to the system from the caller side (via subprocess.Popen(..)),

    • and reading the current process's Unicode command-line arguments from the callee side (via sys.argv).

    This is acknowledged and won't be fixed in Python 2. Both are fixed in Python 3.

    Technical Reasons

    In Python 2, the Windows implementations of subprocess.Popen(..) and sys.argv are not Unicode-ready: subprocess.Popen(..) uses the non-Unicode Windows system call CreateProcess(..) (see the Python code and the MSDN documentation of CreateProcess), and sys.argv does not use GetCommandLineW(..).
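
To illustrate why a non-Unicode ("ANSI") call loses characters, here is a minimal sketch. It assumes the ANSI code page is cp1252 (Western European) and uses Python's cp1252 codec with the "replace" error handler as a stand-in for the conversion Windows performs. The real mbcs conversion also applies best-fit mappings, so the exact bytes can differ. The helper name to_ansi is hypothetical.

```python
# Works on Python 2.7 and Python 3; does not require Windows.

def to_ansi(text, codepage="cp1252"):
    """Hypothetical helper: squeeze a unicode string into a single ANSI
    code page, the way a non-Unicode API would have to before use.

    Assumption: cp1252 stands in for the system ANSI code page.
    """
    return text.encode(codepage, "replace")


# U+00E9 exists in cp1252, so this file name survives the conversion.
print(repr(to_ansi(u"caf\u00e9.txt")))

# U+041A (Cyrillic) has no cp1252 equivalent and is replaced by "?",
# so the child process would receive a file name that does not exist.
print(repr(to_ansi(u"\u041a.txt")))
```

Once the character has been replaced, the original name cannot be recovered on the callee side, which is why the fix has to happen before the command line reaches the system call.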

    In Python 3, the Windows implementation of subprocess.Popen(..) uses the correct Windows system call CreateProcessW(..) starting from 3.0 (see the code in 3.0), and sys.argv uses GetCommandLineW(..) starting from 3.3 (see the code in 3.3).
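
For comparison, a small Python 3 sketch of the round trip that works there: the parent passes a non-ASCII argument and the child reports what it received in sys.argv. The helper name echo_argument and the inline child script are illustrative assumptions. On non-Windows systems it also assumes the locale encoding can represent the character (for example UTF-8).

```python
# Python 3 only.
import subprocess
import sys


def echo_argument(argument):
    """Hypothetical helper: start a child interpreter with `argument`
    and return the ASCII-escaped repr of what the child saw."""
    # ascii() keeps the child's output pure ASCII, so the result does not
    # depend on the console or pipe encoding.
    child_code = "import sys; print(ascii(sys.argv[1]))"
    output = subprocess.check_output(
        [sys.executable, "-c", child_code, argument]
    )
    return output.decode("ascii").strip()


if __name__ == "__main__":
    # The child should report the same code point the parent sent.
    print(echo_argument("\u041a.txt"))
```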

    How is it fixed

    The given patch leverages the ctypes module to call the Windows system function CreateProcessW(..) directly. It provides a new, fixed Popen object by overriding the private method Popen._execute_child(..) and the private function _subprocess.CreateProcess(..), so that CreateProcessW(..) from the Windows system library is set up and used in a way that mimics as closely as possible how it is done in Python 3.6.
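
The core idea can be sketched as follows. This is not the patch itself, only a heavily simplified illustration of calling CreateProcessW(..) through ctypes. The helper name create_process_w is hypothetical, and the sketch ignores pipes, environment, working directory and handle inheritance, all of which a real Popen replacement has to handle.

```python
import ctypes
import sys

# Fixed-width stand-ins for the Win32 typedefs, so the definitions can be
# imported on any platform.
DWORD = ctypes.c_uint32
WORD = ctypes.c_uint16
HANDLE = ctypes.c_void_p
LPWSTR = ctypes.c_wchar_p


class STARTUPINFOW(ctypes.Structure):
    _fields_ = [
        ("cb", DWORD), ("lpReserved", LPWSTR), ("lpDesktop", LPWSTR),
        ("lpTitle", LPWSTR), ("dwX", DWORD), ("dwY", DWORD),
        ("dwXSize", DWORD), ("dwYSize", DWORD), ("dwXCountChars", DWORD),
        ("dwYCountChars", DWORD), ("dwFillAttribute", DWORD),
        ("dwFlags", DWORD), ("wShowWindow", WORD), ("cbReserved2", WORD),
        ("lpReserved2", ctypes.c_void_p), ("hStdInput", HANDLE),
        ("hStdOutput", HANDLE), ("hStdError", HANDLE),
    ]


class PROCESS_INFORMATION(ctypes.Structure):
    _fields_ = [
        ("hProcess", HANDLE), ("hThread", HANDLE),
        ("dwProcessId", DWORD), ("dwThreadId", DWORD),
    ]


def create_process_w(command_line):
    """Hypothetical helper: start `command_line` (a unicode string) with
    CreateProcessW and return the new process id."""
    if sys.platform != "win32":
        raise OSError("CreateProcessW is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateProcessW.argtypes = [
        LPWSTR, LPWSTR, ctypes.c_void_p, ctypes.c_void_p, ctypes.c_int,
        DWORD, ctypes.c_void_p, LPWSTR,
        ctypes.POINTER(STARTUPINFOW), ctypes.POINTER(PROCESS_INFORMATION),
    ]
    kernel32.CreateProcessW.restype = ctypes.c_int
    kernel32.CloseHandle.argtypes = [HANDLE]

    startup = STARTUPINFOW()
    startup.cb = ctypes.sizeof(STARTUPINFOW)
    info = PROCESS_INFORMATION()
    # CreateProcessW may modify the command line, so pass a writable buffer.
    buf = ctypes.create_unicode_buffer(command_line)
    ok = kernel32.CreateProcessW(None, buf, None, None, False, 0, None, None,
                                 ctypes.byref(startup), ctypes.byref(info))
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    # This sketch does not wait for the child, so release both handles.
    kernel32.CloseHandle(info.hThread)
    kernel32.CloseHandle(info.hProcess)
    return info.dwProcessId
```

On Windows, a call such as create_process_w(u'notepad.exe "\u041a.txt"') would hand the command line to the system as UTF-16 without passing through the ANSI code page.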

    How to use it

    How to use the given patch is demonstrated in this blog post. It additionally shows how to read the current process's sys.argv with another fix.
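
For the callee side, one commonly used approach (assumed here, and not necessarily the code from the blog post) is to ask Windows for the wide command line through GetCommandLineW(..) and split it with CommandLineToArgvW(..). The helper name win32_unicode_argv is hypothetical. On other platforms the sketch simply returns sys.argv unchanged.

```python
import ctypes
import sys


def win32_unicode_argv():
    """Hypothetical helper: return the script arguments as unicode strings,
    read from the wide-character command line on Windows."""
    if sys.platform != "win32":
        return list(sys.argv)

    kernel32 = ctypes.WinDLL("kernel32")
    shell32 = ctypes.WinDLL("shell32")
    kernel32.GetCommandLineW.argtypes = []
    kernel32.GetCommandLineW.restype = ctypes.c_wchar_p
    shell32.CommandLineToArgvW.argtypes = [
        ctypes.c_wchar_p, ctypes.POINTER(ctypes.c_int)]
    shell32.CommandLineToArgvW.restype = ctypes.POINTER(ctypes.c_wchar_p)
    kernel32.LocalFree.argtypes = [ctypes.c_void_p]

    argc = ctypes.c_int(0)
    argv = shell32.CommandLineToArgvW(kernel32.GetCommandLineW(),
                                      ctypes.byref(argc))
    if not argv:
        # Splitting failed; fall back to what the interpreter provides.
        return list(sys.argv)
    try:
        args = [argv[i] for i in range(argc.value)]
    finally:
        # CommandLineToArgvW allocates the array; release it with LocalFree.
        kernel32.LocalFree(ctypes.cast(argv, ctypes.c_void_p))

    # The full command line also contains the interpreter and its options;
    # keep only the trailing entries that correspond to sys.argv.
    return args[max(0, len(args) - len(sys.argv)):]


if __name__ == "__main__":
    print(win32_unicode_argv())
```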
