Python os.stat and unicode file names

我寻月下人不归 2020-12-31 05:05

In my Django application, a user has uploaded a file with a unicode character in the name.

When I'm downloading files, I'm calling:

os.path.exists         


        
5 answers
  • 2020-12-31 05:18

    It is easy to get this kind of error when running a service (e.g. gunicorn) from Upstart.

    To fix that, set the locale environment variables in the Upstart job file:

    env LANG=en_US.UTF-8
    env LC_CTYPE=en_US.UTF-8
    env LC_ALL=en_US.UTF-8
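
    A plausible reason these variables help, assuming the error in question is a UnicodeEncodeError raised while a unicode filename is converted to bytes: under the default C locale the filename encoding is typically ASCII, which cannot represent characters such as 漢字. The sketch below only illustrates that encoding step. It is written for Python 3, and the helper name can_encode is made up for illustration.

```python
def can_encode(name, encoding):
    """Return True if `name` can be converted to bytes with `encoding`."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


filename = "漢字"
ascii_ok = can_encode(filename, "ascii")  # encoding typical of the C locale
utf8_ok = can_encode(filename, "utf-8")   # encoding selected by en_US.UTF-8
print(ascii_ok, utf8_ok)
```

    If the first value is False and the second is True, the filename itself is fine and the process locale is the likely culprit.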
    
  • 2020-12-31 05:26

    I'm assuming you're on Unix. If not, please say which OS you're using.

    Make sure your locale is set to UTF-8. Most modern Linux distributions do this by default, usually by setting the environment variable LANG to "en_US.UTF-8" or the equivalent for another language. Also, make sure your filenames are encoded in UTF-8.

    With that set, there's no need to mess with encodings to access files in any language, even in Python 2.x.

    [~/test] echo $LANG
    en_US.UTF-8
    [~/test] echo testing > 漢字
    [~/test] python2.6
    Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
    [GCC 4.3.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.stat("漢字")
    posix.stat_result(st_mode=33188, st_ino=548583333L, st_dev=2049L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=8L, st_atime=1263634240, st_mtime=1263634230, st_ctime=1263634230)
    >>> os.stat(u"漢字")
    posix.stat_result(st_mode=33188, st_ino=548583333L, st_dev=2049L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=8L, st_atime=1263634240, st_mtime=1263634230, st_ctime=1263634230)
    >>> open("漢字").read()
    'testing\n'
    >>> open(u"漢字").read()
    'testing\n'
    

    If this doesn't work, run "locale"; if the values are "C" instead of en_US.UTF-8, you may not have the locale installed correctly.
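
    The shell's locale can differ from the one a server process inherits, so it may help to run the same check from inside Python. This is a minimal diagnostic sketch for Python 3; the function name describe_encoding_environment is hypothetical, and it only reads settings without changing them.

```python
import locale
import os
import sys


def describe_encoding_environment():
    """Collect the settings that influence how filenames are encoded."""
    info = {
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
    }
    for var in ("LANG", "LC_ALL", "LC_CTYPE"):
        info[var] = os.environ.get(var)  # None if the variable is unset
    return info


report = describe_encoding_environment()
for key, value in sorted(report.items()):
    print(key, "=", value)
```

    Calling this once from an interactive shell and once from the deployed application (for example, by logging the result) should show whether the two environments disagree.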

    If you're on Windows, I think Unicode filenames should always just work (at least for the os/posix modules), since the Unicode file API in Windows is supported transparently.

  • 2020-12-31 05:26

    None of these solutions worked for me. However, I did find the (a?) solution. There is yet another place in Apache settings where one has to add the locale setting if one uses WSGI. Official docs are here. Add the following two lines to /etc/apache2/envvars (on Ubuntu):

    export LANG='en_US.UTF-8'
    export LC_ALL='en_US.UTF-8'
    

    Then restart the server. This solved my problem.

  • 2020-12-31 05:34

    Encode the filename to the filesystem encoding before calling os.stat or os.path.exists. See the locale module.
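
    One way to read this advice, sketched for Python 3: os.fsencode converts a str path to bytes using the filesystem encoding, and os.stat accepts the resulting bytes. In Python 2 the rough equivalent would be path.encode(sys.getfilesystemencoding()). The wrapper name stat_encoded is made up for illustration, and the demo file uses an ASCII-only name so that it also runs where the filesystem encoding cannot represent non-ASCII characters.

```python
import os
import sys
import tempfile


def stat_encoded(path):
    """Encode `path` with the filesystem encoding, then stat the bytes path."""
    encoded = os.fsencode(path)  # uses sys.getfilesystemencoding()
    return os.stat(encoded)


# Demonstration in a throwaway directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as handle:
        handle.write("testing\n")
    size = stat_encoded(path).st_size

print(sys.getfilesystemencoding(), size)
```

    Note that this only moves the failure earlier if the filesystem encoding is ASCII: the encode step itself will raise for a name like 漢字, which is why the other answers fix the locale instead.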

  • 2020-12-31 05:40

    Change your HTTP server to use a UTF-8 locale. For example, I use apache2 on CentOS, so I changed the locale in /etc/sysconfig/httpd by setting HTTPD_LANG:

    # CentOS uses /etc/sysconfig/httpd to configure environment variables.
    #
    # By default, the httpd process is started in the C locale; to
    # change the locale in which the server runs, the HTTPD_LANG
    # variable can be set.
    #
    # HTTPD_LANG=C
    HTTPD_LANG=en_US.UTF-8  # you can change to your locale.
    